Technical Specification of File Sharing Community (1st draft)
Content:
1. Shared File Locations
2. Configuration Files
3. Script Files for Batch Mode
To uniquely identify a shared file provided by some community member, we must provide a mechanism to uniquely represent the location of that file. Our proposal is rather simple and is described as follows:
1. Host addresses
The address of a community member is represented by its IP or domain name.
2. The FSC root directory of a community member
All shared files of a community member are stored under the FSC root directory, which may contain subdirectories. The root directory is a configurable parameter defined in “fsc_daemon.cfg” (ref. Configuration Files). For example, on the WIN32 platform the root directory might look like “c:\tmp\share\”, while on a UNIX-like platform it might look like “/tmp/share/”.
3. Shared file locations
The location of a shared file is composed of the host address of the community member who provides the file and the path of the file relative to the FSC root directory. For example, let Ha (IP: 10.1.168.97) be a community member whose FSC root directory is “/tmp/share/” (or “c:\tmp\share\” on the WIN32 platform). Then the location of the shared file “top_secret.txt” stored in the FSC root directory is represented as “10.1.168.97/top_secret.txt”. That is, other community members can use this location to fetch the file “/tmp/share/top_secret.txt” (or “c:\tmp\share\top_secret.txt” on the WIN32 platform) owned by Ha.
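The mapping between a location string and a local path can be sketched as follows. This is a minimal Python sketch, not part of the specification: the function names `split_location` and `resolve_local_path` are hypothetical, the first “/” is assumed to separate the host address from the relative path, and rejecting paths that escape the FSC root directory is an added assumption. Only UNIX-style paths are handled.

```python
import posixpath


def split_location(location):
    """Split 'host/relative/path' into (host, relative_path)."""
    host, sep, rel_path = location.partition("/")
    if not host or not sep or not rel_path:
        raise ValueError(f"not a valid shared file location: {location!r}")
    return host, rel_path


def resolve_local_path(root_dir, rel_path):
    """Map a relative path onto the FSC root directory (UNIX-style paths)."""
    full = posixpath.normpath(posixpath.join(root_dir, rel_path))
    root = posixpath.normpath(root_dir)
    # Assumption: a request must never escape the FSC root directory.
    if not full.startswith(root.rstrip("/") + "/"):
        raise ValueError(f"path escapes the FSC root: {rel_path!r}")
    return full


host, rel = split_location("10.1.168.97/top_secret.txt")
print(host, resolve_local_path("/tmp/share/", rel))
```

With the example above, the host part is “10.1.168.97” and the resolved path on Ha is “/tmp/share/top_secret.txt”.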
According to the requirements described in the MRD, we divide the configurable parameters into two categories: one for the directory service and the other for the FSC daemon.
1. The configuration file for the directory service:
a. Filename: “dir_service.cfg”
b. Contents: refer to Table A.
Table A
| Name | Description | Value |
|---|---|---|
| FileInfoCacheSize | The size (in bytes) of the cache that is used to store information about remote shared files. Default value: 1M (1024K) | Non-negative integer, e.g. 10240, 1024K, 512M. |
| ShareFileInfo | A community member can refuse to provide shared file information to others by setting this parameter to “No”. Default value: Yes | Yes/No |
| MemberInfoCacheSize | The size (in bytes) of the cache that is used to store membership information. Default value: 512K | Positive integer. |
| ShareMemberInfo | A community member can refuse to provide the membership information it has collected to others by setting this parameter to “No”. Default value: Yes | Yes/No |
| ActiveInfoExchg | A community member can actively exchange information (about shared files and membership) with other community members. When set to “Yes”, it actively sends information to other members and updates its local caches according to information received from others. When set to “No”, no information is sent actively and information received from other members is ignored. Default value: No | Yes/No |
| InfoExchgPeriod | The period, in seconds, at which the directory service actively multicasts its information. This parameter has no effect when “ActiveInfoExchg” is set to “No”. | Non-negative integer. |
Example:
The content of “dir_service.cfg” might look like:
FileInfoCacheSize 1024K
MemberInfoCacheSize 512K
ShareFileInfo No
ShareMemberInfo Yes
ActiveInfoExchg Yes
2. The configuration file for the FSC daemon:
a. Filename: “fsc_daemon.cfg”
b. Contents: refer to Table B.
Table B
| Name | Description | Value |
|---|---|---|
| FSCRootDir | The path of the FSC root directory. If this parameter is not set, no file on the host is shared with other community members. Default value: None | String. |
| MasterQuerySite | The host that is contacted first when the FSC daemon starts up. Default value: localhost | IP or domain name. |
| DefQueryTimeout | The default timeout value for each query command. Default value: -1 | Integer. -1 means “forever”. |
| DefQueryMaxResult | The default maximum number of results of each query command. Default value: 1000 | Positive integer. |
| DefSearchTimeout | The default timeout value for each search command. Default value: -1 | Integer. -1 means “forever”. |
| DefSearchMaxResult | The default maximum number of results of each search command. Default value: 1000 | Positive integer. |
| DefFetchTimeout | The default timeout value for each fetch command. Default value: -1 | Integer. -1 means “forever”. |
| DefFetchedFileDir | The default directory where fetched files are stored. If this parameter is not set and no directory is explicitly specified, the FSC daemon will fail to fetch files. Default value: None | String. |
Example:
The content of “fsc_daemon.cfg” might look like:
FSCRootDir /tmp/share/
DefFetchedFileDir /tmp/fetched_files/
MasterQuerySite 192.168.33.57
DefQueryTimeout 50
DefQueryMaxResult 500
DefSearchTimeout 60
DefSearchMaxResult 100
DefFetchTimeout 40
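Reading the two configuration files might be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, each line is assumed to hold a name and a value separated by whitespace (as in the examples above), and the suffixes “K” and “M” are assumed to mean 1024 bytes and 1024K respectively. The defaults are copied from Table B.

```python
# Defaults taken from Table B; None means "not set".
DAEMON_DEFAULTS = {
    "FSCRootDir": None,
    "MasterQuerySite": "localhost",
    "DefQueryTimeout": -1,
    "DefQueryMaxResult": 1000,
    "DefSearchTimeout": -1,
    "DefSearchMaxResult": 1000,
    "DefFetchTimeout": -1,
    "DefFetchedFileDir": None,
}


def parse_config(text):
    """Parse 'Name Value' lines into a dict of strings.

    Blank lines and lines without a value are skipped.
    """
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def parse_size(text):
    """Convert '10240', '1024K' or '512M' to a number of bytes."""
    units = {"K": 1024, "M": 1024 * 1024}  # assumption: binary units
    text = text.strip()
    factor = units.get(text[-1:].upper(), 1)
    digits = text[:-1] if factor != 1 else text
    return int(digits) * factor


def load_daemon_config(text):
    """Overlay the settings found in the file on the Table B defaults."""
    config = dict(DAEMON_DEFAULTS)
    for name, value in parse_config(text).items():
        # Integer-valued parameters are recognized by the type of their default.
        is_int = isinstance(DAEMON_DEFAULTS.get(name), int)
        config[name] = int(value) if is_int else value
    return config


daemon = load_daemon_config("FSCRootDir /tmp/share/\nDefQueryTimeout 50\n")
print(daemon["DefQueryTimeout"], daemon["DefSearchTimeout"])
print(parse_size("1024K"))
```

Parameters missing from the file keep their default, so a file that sets only “FSCRootDir” still yields a complete set of daemon settings.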
The format of batch mode script files is rather simple. The general format looks like:
# one-line comment
parameter_1,1 = value_1,1
parameter_1,2 = value_1,2
…
command_1
…
parameter_n,1 = value_n,1
parameter_n,2 = value_n,2
…
command_n
According to the MRD, there are currently only three kinds of commands: query, search, and fetch. To provide maximum flexibility in batch mode, we propose that the hosts to be queried, the file patterns to be searched, and so on, be listed in input files. For example, the input file for a query command is a list of hosts, while the input file for a search command is a list of file patterns. The contents of an output file are the results of the corresponding command. For example, the search output file contains a list of file locations. Note that the output file of one command may be used as the input file of another command (consider the output file of search and the input file of fetch).
Table C
| Name | Description | Value |
|---|---|---|
| Parameters for Query: | | |
| QueryInputFile | This parameter specifies the input file of the following query command. The content of this input file is a list of hosts to be queried. | String. |
| QueryTimeout | This parameter specifies the timeout value of the following query command. If not specified, “DefQueryTimeout” defined in “fsc_daemon.cfg” will be used. | Integer. -1 means “forever”. |
| QueryMaxResult | This parameter specifies the maximum number of query results of the following query command. If not specified, “DefQueryMaxResult” defined in “fsc_daemon.cfg” will be used. | Non-negative integer. |
| QueryOutputFile | This parameter specifies the output file of the following query command. | String. |
| Parameters for Search: | | |
| SearchInputFile | This parameter specifies the input file of the following search command. The content of this input file is a list of file patterns to be searched. | String. |
| SearchTimeout | This parameter specifies the timeout value of the following search command. If not specified, “DefSearchTimeout” defined in “fsc_daemon.cfg” will be used. | Integer. -1 means “forever”. |
| SearchMaxResult | This parameter specifies the maximum number of search results of the following search command. If not specified, “DefSearchMaxResult” defined in “fsc_daemon.cfg” will be used. | Non-negative integer. |
| SearchOutputFile | This parameter specifies the output file of the following search command. | String. |
| Parameters for Fetch: | | |
| FetchInputFile | This parameter specifies the input file of the following fetch command. The content of this input file is a list of file locations to be fetched. | String. |
| FetchTimeout | This parameter specifies the timeout value of the following fetch command. If not specified, “DefFetchTimeout” defined in “fsc_daemon.cfg” will be used. | Integer. |
| FetchOutputFile | This parameter specifies the output file of the following fetch command. | String. |
| FetchedFileDir | This parameter specifies the directory under which fetched files are to be placed. | String. |
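The fallback from a script parameter to its “Def…” counterpart in “fsc_daemon.cfg” can be sketched as follows. The function name `effective_settings` and the use of plain dictionaries for the script parameters and the daemon settings are assumptions made for this illustration only.

```python
def effective_settings(command, script_params, daemon_config):
    """Resolve Timeout and MaxResult for 'query', 'search' or 'fetch'."""
    prefix = command.capitalize()
    resolved = {}
    for suffix in ("Timeout", "MaxResult"):
        if command == "fetch" and suffix == "MaxResult":
            continue  # Table C defines no MaxResult parameter for fetch
        name = prefix + suffix
        if name in script_params:
            # The value given in the script takes precedence.
            resolved[name] = int(script_params[name])
        else:
            # Otherwise fall back to the daemon-wide default.
            resolved[name] = daemon_config["Def" + name]
    return resolved


defaults = {"DefQueryTimeout": -1, "DefQueryMaxResult": 1000}
print(effective_settings("query", {"QueryTimeout": "60"}, defaults))
```

Here the script sets only “QueryTimeout”, so “QueryMaxResult” is taken from “DefQueryMaxResult”.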
The Content of Input/Output Files of Commands
1. The query command
a. The input file of the query command is a list of hosts, one host per line. A host can be represented by its IP or domain name. For example, let “file_a.qry” be an input file of the query command. The content of this file might look like:
10.1.168.99
192.168.33.97
fakecompany.com.tw
140.113.123.213
b. The output file of the query command is a list of file locations, one location per line. For example, let “file_b.res” be the output file of the query command. The content of the output file might look like:
10.1.168.99/mp3/southpark.mp3
10.1.168.99/mp3/slamdunk.mp3
192.168.33.97/movie/castaway1.dat
fakecompany.com.tw/fake_doc/fake_secret.doc
fakecompany.com.tw/prog/twno1.c
140.113.123.213/haha.c
140.113.123.213/hehe.h
2. The search command
a. The input file of the search command is a list of file patterns, one pattern per line. For example, let “file_c.srch” be the input file of the search command. The content of the input file might look like:
*.mp3
*.dat
*.c
*.doc
*.h
b. The output file of the search command is the file in which the search results are stored. Its format is the same as that of the query output file.
3. The fetch command
a. The input file of the fetch command is a list of file locations, one location per line. The format of the fetch input file is the same as that of the query output file (or search output file).
b. The output file of the fetch command is the file where the fetch results are stored. The content of the file indicates whether a given file, which was specified in the fetch input file, was fetched or not.
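How the patterns of a search input file relate to the locations in a search output file can be illustrated with a small sketch. The specification does not define the pattern syntax, so this example assumes shell-style wildcards matched against the file name (the last component of the location); the function name `match_locations` is hypothetical.

```python
from fnmatch import fnmatchcase


def match_locations(patterns, locations):
    """Return the locations whose file name matches at least one pattern."""
    matches = []
    for location in locations:
        # Assumption: only the last path component is compared.
        file_name = location.rsplit("/", 1)[-1]
        if any(fnmatchcase(file_name, pattern) for pattern in patterns):
            matches.append(location)
    return matches


known = [
    "10.1.168.99/mp3/southpark.mp3",
    "192.168.33.97/movie/castaway1.dat",
    "140.113.123.213/haha.c",
    "140.113.123.213/hehe.h",
]
for location in match_locations(["*.mp3", "*.h"], known):
    print(location)
```

Because the result has the same one-location-per-line format as the query output file, it can be written out and used directly as a fetch input file.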
The Semantics of Timeout and MaxResult in Batch Files
Various Timeout and MaxResult parameters are defined in the previous sections, but their exact meaning has not yet been described. Consider the following script:
QueryInputFile = “file_a”
QueryTimeout = 60
QueryMaxResult = 200
QueryOutputFile = “file_b”
query
and the content of “file_a”:
hostA
hostB
When the script is executed, the FSC daemon issues two query requests, one to hostA and the other to hostB. QueryTimeout and QueryMaxResult are applied to each individual request. That is, the timeout value of the request to hostA is 60 seconds, and the timeout value of the request to hostB is also 60 seconds. Likewise, each request may return up to 200 results, so there might be at most 400 results in the output file “file_b”. The same semantics apply to the other Timeout and MaxResult parameters.
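The per-request semantics can be made concrete with a short sketch. The names `run_query_batch` and `fake_query_host` are hypothetical; the latter is a stand-in for the real network request and simply pretends that every host shares 250 files.

```python
def run_query_batch(hosts, query_host, timeout, max_result):
    """Query each host separately; both limits apply per request."""
    results = []
    for host in hosts:
        # `query_host` is a hypothetical callable that returns the list of
        # file locations reported by one host within `timeout` seconds.
        found = query_host(host, timeout)
        # The result limit is applied to this request only.
        results.extend(found[:max_result])
    return results


def fake_query_host(host, timeout):
    """Stand-in for a real request: every host shares 250 files."""
    return [f"{host}/file{i}.txt" for i in range(250)]


output = run_query_batch(["hostA", "hostB"], fake_query_host,
                         timeout=60, max_result=200)
print(len(output))
```

With two hosts and QueryMaxResult set to 200, the output holds 200 locations from each host, which is the upper bound of 400 described above.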
Example of Script Files
Let the content of an input file, “query_input”, look like:
fakehost.com.tw
10.1.192.168
We can write a script file to query the shared files provided by these two hosts and then fetch those files:
# This is a comment
QueryInputFile = “query_input”
QueryTimeout = 60
QueryMaxResult = 300
QueryOutputFile = “query_output”
query
# Fetched files are to be placed under DefFetchedFileDir (ref. fsc_daemon.cfg)
FetchInputFile = “query_output”
FetchTimeout = 300
FetchOutputFile = “fetch_output”
fetch
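A reader for such script files might be sketched as follows. The grammar is inferred from the general format and is an assumption: a line starting with “#” is a comment, a line containing “=” sets a parameter, a line consisting of query, search, or fetch is a command, and parameters apply only to the command that follows them. The function name `parse_script` is hypothetical, and surrounding quotation marks (straight or curly) are stripped from values.

```python
COMMANDS = {"query", "search", "fetch"}


def parse_script(text):
    """Return a list of (command, parameters) pairs in script order."""
    steps, pending = [], {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or one-line comment
        if "=" in line:
            name, _, value = line.partition("=")
            pending[name.strip()] = value.strip().strip("\"“”")
        elif line in COMMANDS:
            # Assumption: parameters are consumed by the next command.
            steps.append((line, pending))
            pending = {}
        else:
            raise ValueError(f"line {number}: unrecognized statement {line!r}")
    return steps


SCRIPT = """\
# query two hosts, then fetch what they share
QueryInputFile = "query_input"
QueryTimeout = 60
QueryOutputFile = "query_output"
query
FetchInputFile = "query_output"
fetch
"""
for command, params in parse_script(SCRIPT):
    print(command, params)
```

Values are kept as strings here; converting them to integers and applying the defaults from “fsc_daemon.cfg” would be a separate step.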