WHEREHOO TECHNICAL OVERVIEW
Author: Jim Youll
Last update: 2 March 2001, 16:30 EST
This version of the documentation corresponds with version 0.700 of the Wherehoo server.
Want to write code to talk to the Wherehoo server? Skip down to the section labeled "WHEREHOO TCP INTERFACE"
BACKGROUND, STUFF AND BOTHER
An infrastructure element of the Media Lab's Impulse project, Wherehoo is an interactive location service for software agents. It binds a link to data (e.g. the URL of a web page) or data itself (e.g. a music file) to some precise location on Earth. Agents query existing data and insert records for use by themselves or other agents. Wherehoo is equally good at indexing long-lived things such as restaurants, and short-lived things like subway trains at stations, buses at bus stops, or GPS-wired rollerbladers.
The role of Wherehoo is to facilitate location-based agent activity, not to conduct this activity on its own. Features like web page spidering, keywords and other non-location-specific data have been kept out of the system. Other servers, systems and middle agents can handle that work on their own by many different means. We wanted agents to have a variety of strategies available to them. Also, by not involving itself in protocols or data types (but by relaying information about protocols or data types represented in a record to agents), the server can support present and future applications, protocols and data formats without modification. We believe the Wherehoo server demonstrates a proper level of abstraction for an agent data service provider.
The server has a socket interface for agent interaction, the dominant mode of operation. It is written in Java with JDBC and Mark Matthews' JDBC driver. The database server is MySQL. A web-based front-end to the server is used for human interaction, demonstration, and occasional maintenance. Web pages served by the Apache httpd use PHP to interact with the database.
WHAT IS IN A WHEREHOO RECORD?
A Wherehoo record contains the coordinates (X,Y,Z or Latitude,Longitude,Height) of some location on Earth, and a binary data block representing either (1) a pointer to some data or (2) the data itself. For example, a Wherehoo record bound to the location of The Other Side Cafe may contain the URL of a review of the cafe, or it might store a JPG image of the cafe directly in Wherehoo. Small binary records are stored in the server so that simple Things That Think do not have to make arrangements with a second server via yet another protocol to host information about themselves. For example, a Kozmo.com drop box may want to report that it contains some tapes. Using Wherehoo, the box can maintain its own location and status within Wherehoo, rather than using additional protocols and servers to store state data.
Records may have "BEGIN" and "END" dates and times. A record may be created for a future, short-term event, for example. The record ordinarily appears in query results after the event's "begin" time, and stops appearing after the event's "end" time. By default, Wherehoo does not look into the future or past: only records that are active at the moment of the query are returned. The ability to look backward or forward in time was added in v0.59 to support learning agents that watch the behavior of short-lived objects over time (such as people walking, subway trains, etc). Begin/end time data are sent to Wherehoo as relative offsets from "now," rather than absolute dates/times. Wherehoo is thus resilient with regard to differing time zones and unsynchronized clocks. Each device storing data in Wherehoo need only be true to itself when setting begin and end times.
QUERY PARAMETERS AND PROCESS
Wherehoo draws a square around "location", with the centers of all edges at distance "search radius" from "location." This results in the inclusion of approximately 27% more area in the search (the area of a square of width=2R minus the area of a circle having diameter=2R). It is likely that a sufficiently-populated Wherehoo database will return some records bound to locations with a straight-line distance beyond the specified search radius. Given the intended use of the search results ("on the street") and the fact that straight-line distances to even relatively nearby locations do not precisely describe the distance one must travel to reach that place on roads and sidewalks, this should not pose a problem for agents that anticipate (or don't care about) this possible outcome. Presently all searches are conducted without regard to the height of an object. This behavior may change in future versions of the server.
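The 27% figure and the "beyond the radius" effect can be checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes a flat plane with offsets already expressed in meters east and north of "location", which is not necessarily how the server computes distances.

```python
import math

def extra_area_fraction():
    """Extra area of a 2R x 2R square relative to a circle of radius R."""
    # (4R^2 - pi*R^2) / (pi*R^2); R cancels out.
    return (4.0 - math.pi) / math.pi

def in_search_square(dx_m, dy_m, radius_m):
    """True if an offset (meters east, meters north) falls inside the square."""
    return abs(dx_m) <= radius_m and abs(dy_m) <= radius_m

def straight_line_distance(dx_m, dy_m):
    """Planar straight-line distance for the same offset."""
    return math.hypot(dx_m, dy_m)

print(round(extra_area_fraction() * 100, 1))        # 27.3
# A record near a corner passes the square test even though it is
# farther away than the requested radius (at most R * sqrt(2)).
print(in_search_square(240, 240, 250))               # True
print(round(straight_line_distance(240, 240)))       # 339
```

An agent that needs a strict circular search can apply the same distance test to the "distance" field of each returned record and drop those beyond its radius.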
RESPONSE TO QUERIES
Wherehoo answers queries by sending a series of record headers, one per line, describing data located in the search area. The records are not sorted. Each record is presented to the client via a header that describes the record, including: size of data field, whether or not there is metadata, the data's MIME type, protocol needed to retrieve the data, and the range, bearing and heading from the current location to the location represented by the record.
Once a client has the record header, it may "SKIP" the record, ask for the textual metadata describing the record (if there is any), ask for the DATA field contents, or say "BYE" to terminate the connection.
WHEREHOO TCP INTERFACE
The Wherehoo server talks on port 5859 via TCP.
Currently there is one server in the world, wherehoo.media.mit.edu.
In the future there may be more, but clients will probably only need to know about one Wherehoo server to find all the others.
The communications protocol is reminiscent of SMTP.
Each command is issued on a single line, command first, parameters after, then a return.
A command is separated from its parameters by one or more blank spaces.
Multiple parameters for one command appear on the same line as the command and are separated by one or more blank spaces.
Case does not matter.
If a command is sent more than once, the most recent values are used.
Sample command: LLH 42 -071 0
Unlike SMTP, there is no required sequence for commands to the server. All parameters sent by the client are analyzed together upon receipt of the client's EOD (EOD is signaled by a '.' on a line by itself). However all data needed by the server must be in place before the client sends its EOD.
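The line rules above (one command per line, blank-separated parameters, case-insensitive names, last value wins, '.' as EOD) can be sketched as follows. This is an illustration of the rules as stated, not the server's code; the function names are hypothetical.

```python
def format_command(name, *params):
    """Build one command line: command first, parameters after."""
    return " ".join([name.upper()] + [str(p) for p in params])

def collect_commands(lines):
    """Gather commands up to the EOD marker ('.' on a line by itself).

    Names are treated case-insensitively, and a repeated command keeps
    only its most recent parameters. Parameters are kept as a list of
    blank-separated tokens.
    """
    state = {}
    for line in lines:
        line = line.strip()
        if line == ".":
            break                      # EOD: everything is analyzed together
        if not line:
            continue
        name, *params = line.split()   # one or more blanks separate fields
        state[name.upper()] = params
    return state

request = [
    format_command("act", "QUERY"),
    "llh 42 -071 0",
    "RAD   100",
    "rad 250",      # sent twice: the later value replaces the earlier one
    ".",
]
print(collect_commands(request))
```

Because there is no required command order, a dictionary keyed by command name is enough to model what has been received before the EOD.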
Insert operations require authorization. Clients must IDT themselves if they are doing INSERTs. An authentication code is required. At present these are issued by the Wherehoo administrator. In the future they will be generated and issued automatically.
Queries do not require authentication. However, a client's IDT yields an IDT from the server, containing server operating parameters including timeouts and size limits for data fields that are useful to querying agents. All agents may benefit from knowing these limits before trying to interact with the server.
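A client might read those operating parameters from the server's IDT reply as sketched below. The field order (wherehoo_server version timeout min_expiration max_meta max_data) is taken from the message table in this document. Treating the four limits as integers is an assumption based on the sample sessions, and the names ServerInfo and parse_idt_response are hypothetical.

```python
from typing import NamedTuple

class ServerInfo(NamedTuple):
    version: str
    timeout: int
    min_expiration: int
    max_meta: int
    max_data: int

def parse_idt_response(line):
    """Split the space-delimited IDT reply into named fields."""
    fields = line.split()
    if len(fields) != 6 or fields[0].lower() != "wherehoo_server":
        raise ValueError("not an IDT response: %r" % line)
    version, *numbers = fields[1:]
    # ASSUMPTION: the limits are integers, as in the sample sessions.
    timeout, min_expiration, max_meta, max_data = (int(n) for n in numbers)
    return ServerInfo(version, timeout, min_expiration, max_meta, max_data)

info = parse_idt_response("wherehoo_server 0.56 30 20 1024 65535")
print(info.version, info.max_meta, info.max_data)   # 0.56 1024 65535
```

An inserting agent could compare the size of its data and metadata against max_data and max_meta before sending anything.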
Not all commands are required for all operations. Many of the commands (particularly when querying) are optional, and default to useful values if not explicitly set by the client. When a command is sent, however, all its parameters are mandatory.
MESSAGES FROM THE CLIENT TO THE SERVER
| Identification and authentication | |
| IDT identity | Identity - identifies client. Must match an IDT on file, for INSERT actions, or the insert will fail. |
| SHA signature | SHA-1 message digest used as a signature over data |
| Query and record insertion | |
| ACT QUERY | Action: Run a query and return the matching records |
| ACT COUNT | Action: Run a query and return the number of matching records |
| ACT INSERT | Action: Insert a record into the database and return the uniqueid of the record. |
| ACT DELETE | Action: Delete a record from the database and return either "ACK" for success or "NAK" for failure. Records may only be deleted if the IDT of the creator and the UID of the record are presented and match the values that were stored in the record when it was created. |
| LLH L1 L2 L3 | Current location in Lat/Lon/Height coordinates (type double) |
| XYZ L1 L2 L3 | Current location in X/Y/Z coordinates (type double) |
| Specifically for queries | |
| BEG yy mo dd hh mi ss | Return records that were alive on or after this offset from NOW (NOW is calculated at the moment the query is run). If specifying BEG in a query, always use END as well or results may be confusing. Offsets may be positive or negative. If BEG and END are not specified (they should be used only when really necessary) the records returned are those that are alive at the moment the query is run, that is, "now". |
| END yy mo dd hh mi ss | Return records that were alive on or before this offset from NOW (NOW is calculated at the moment the query is run). If specifying END in a query, always use BEG as well. Offsets may be positive or negative. If BEG and END are not specified (they should be used only when really necessary) the records returned are those that are alive at the moment the query is run, that is, "now". |
| HDG heading | Heading for directionally-focused searches (0.0 <= heading <= 360.0, type double). Default is 0°=due North. As of version 0.70, LIM is not compatible with a directional search. Do not use LIM and HDG together. |
| LEN length | If not using RAD, search length in meters (0° and 180° at current heading, type integer) |
| LIM limit | Limit maximum number of records returned (type integer). limit must be >= 0. As of version 0.70, LIM is not compatible with a directional search. Do not use LIM and HDG together. |
| MET metadata | Return only records with the substring metadata anywhere in their META fields. Matches are case-insensitive. Use sparingly. Records with no metadata, and those with similar but non-exact matches, will not be returned when this parameter is included. MET is quite useful when looking for a set of records tagged with similar metadata (e.g. you know something was inserted with specific metadata, and only want those records). |
| MIM mimetype | Return only records whose header indicates they are exactly this MIME type (full-string match is performed) |
| PJT heading range | "Project" the user's location from the stated Latitude and Longitude (LLH command) to a new place at a position that is "range" distance from the current Lat,Lon at the given "heading". This is useful for things that need to pretend they are at a place "away from" the current location. The internal calculation is as precise as (and derived from) the other distance and heading calculations built into the server. (-360.0 < heading < 360.0) |
| PRO protocol | Return only records whose protocol is exactly protocol. If this parameter is not sent with the query, records with all protocols are returned. The table of protocols is maintained manually. |
| RAD radius | Search radius, in meters (0°, 90°, 180° and 270° around current location at current heading, type integer). Sets WID = LEN = radius. Alternatively, use WID and LEN to specify a non-square search area. |
| SHP shapecode | Shape of directional search {rect_ctr | rect_fwd} |
| WID width | If not using RAD, search width in meters (90° and 270° at current heading, type integer) |
| META | After receiving descriptive data about a Wherehoo record, client wants the record's metadata |
| DATA | After receiving descriptive data about a Wherehoo record, client wants the actual data block |
| SKIP | After receiving descriptive data about a Wherehoo record, client does not want the data block. Skip to next record |
| Specifically for record insertion | |
| BEG yy mo dd hh mi ss | Offset from NOW for the beginning date/time of this record, defaults to 00/00/0000 if not specified. "NOW" is calculated at the moment the record is inserted or updated (when server receives client's EOD marker). |
| DAT data_size | Here comes the binary data. The client should send data_size bytes of data followed by a 20-byte SHA-1 signature. A class that holds the data block can provide a method SHAhash(mysecret) to return the SHA-1 signature. The total bytes your client sends must be exactly data_size + 20. Immediately after sending the 'DAT data_size' line, send your bytes of data plus the 20-byte signature, all as one contiguous binary block. Then go back to line-reading and watch for "ACK". The server will answer with an ACK after it has received your data_size bytes plus the 20-byte signature. Where do I get an identity and secret so I can add records? At present these are issued by the Wherehoo administrator. |
| END yy mo dd hh mi ss | Offset from NOW for the ending date/time of this record, defaults to 12/31/9999 if not specified. "NOW" is calculated at the moment the record is inserted or updated (when server receives client's EOD marker). |
| MET metadata | Text metadata describing the thing pointed to by this record. |
| MIM mimetype | MIME type of the data pointed to by (or contained in) this record. NOTE: Beginning with version 0.630, 11-1-00, MIME types are no longer validated against a list. We still recommend using well-known MIME types (else nobody will be able to use your records)! |
| PRO protocol | The protocol an agent will use to retrieve the thing pointed to by this record. |
| Specifically for record deletion | |
| UID uniqueid | The 40-character uniqueid of the record that is to be deleted. This information was reported to the creating client at the time the record was created and cannot be retrieved otherwise. Clients that want to delete or update their records will need to store and manage the records' UIDs. |
| Connection control | |
| DBG debug_mode | Mode of debugging {on | off}. Default is OFF. When ON, state info is sent back toward you. Don't use unless debugging the server, or your communications stream will be full of unwanted goo. |
| NOP | No-op to forestall rx timeout by server if your client is running slowly, or to see if server is still listening |
| BYE | Close our connection now |
| . | EOD. Client is finished sending data to server. Server should run the query now. |
| Responses from the server | |
| wherehoo_server version timeout min_expiration max_meta max_data | Server's response to an IDT from the client. All fields are variable length, space-delimited. wherehoo_server is a constant text string. |
| OK | Search parameters accepted. Server is running query or insert operation. |
| NAK command_list | These commands had invalid or out of bounds parameters. Query not running. |
| ACK | Response to a NOP. Server is alive and listening. |
| . | EOD (End of data). There are no more records to send you. |
| BYE | Server is closing the connection |
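The byte layout described for DAT (a 'DAT data_size' line, then data_size bytes, then a 20-byte SHA-1 signature, all contiguous) can be sketched as below. Two things here are assumptions rather than documented behavior: the way the secret and the data are combined before hashing, and the CRLF line ending. Check both against the server before relying on this. The function names are hypothetical.

```python
import hashlib

SIGNATURE_LEN = 20   # a SHA-1 digest is always 20 bytes

def sign(data: bytes, secret: bytes) -> bytes:
    """Return a 20-byte SHA-1 digest over the data and the client's secret."""
    # ASSUMPTION: this document does not say how the secret and the data
    # are combined; hashing secret + data is only a placeholder choice.
    return hashlib.sha1(secret + data).digest()

def frame_dat(data: bytes, secret: bytes, newline: bytes = b"\r\n") -> bytes:
    """Build the DAT line followed by data and signature as one block."""
    # ASSUMPTION: CRLF line ending; the document only says "a return".
    header = b"DAT " + str(len(data)).encode("ascii") + newline
    return header + data + sign(data, secret)

payload = b"hello world!"                       # 12 bytes of data
frame = frame_dat(payload, b"mysecret")
body = frame[len(b"DAT 12\r\n"):]
print(len(body) == len(payload) + SIGNATURE_LEN)   # True
```

After writing the frame, the client goes back to reading lines and waits for the server's ACK.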
DATA FROM THE SERVER TO THE CLIENT
In response to a query from the client, the server sends back individual records. Record descriptive data appears on one line, space-delimited, in this format:
| bearing | Bearing from your location to this object, in degrees, at your current HEADING |
| compassdirection | Compass direction from your location to this object {N | NE | E | SE | S | SW | W | NW } |
| distance | Straight-line distance from your location to this object, in meters |
| ttl | Time-to-live of this record in the database, in seconds. TTLs longer than 99999999 seconds, just over 3 years, come back as 99999999. |
| bytes | Number of bytes of data in the object's data block (to be sent if client replies DATA) |
| protocol | If object's data block is a URI, the protocol to be used for fetching the data. If object's data block is stored in the Wherehoo server, the protocol is "WHEREHOO". |
| mimetype | The MIME data type of the data represented by the object's data block, whether served directly from WHEREHOO or some other server (protocol tells you where it's coming from) |
| meta | "META" if textual metadata exists for this record, "NONE" if no metadata exists. Client replies META to tell server to send this text. |
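A client could split a record header line into the eight fields listed above as sketched here. The field order follows the table and the sample sessions in this document. Parsing bearing and distance as floats is an assumption, since their numeric types are not stated; the names RecordHeader and parse_record_header are hypothetical.

```python
from typing import NamedTuple

class RecordHeader(NamedTuple):
    bearing: float       # degrees, relative to the client's heading
    compass: str         # N, NE, E, SE, S, SW, W or NW
    distance_m: float    # straight-line distance in meters
    ttl_s: int           # capped by the server at 99999999
    size: int            # bytes in the data block
    protocol: str
    mimetype: str
    has_meta: bool       # True when the last field is META

def parse_record_header(line):
    """Parse one space-delimited record header line."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError("expected 8 fields, got %d" % len(fields))
    bearing, compass, distance, ttl, size, protocol, mimetype, meta = fields
    return RecordHeader(
        float(bearing), compass.upper(), float(distance),
        int(ttl), int(size), protocol.upper(), mimetype,
        meta.upper() == "META",
    )

header = parse_record_header("96 N 1443 99999999 33212 WHEREHOO IMAGE/JPG META")
print(header.size, header.protocol, header.has_meta)   # 33212 WHEREHOO True
```

The size field tells the client exactly how many bytes to read if it replies DATA.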
SAMPLE CLIENT-SERVER INTERACTION - A Query
| From the client to the server | From the server to the client |
| IDT jim | |
| | wherehoo_server 0.56 30 20 1024 65535 |
| ACT QUERY | |
| LLH 72.019 288.908 0 | |
| RAD 250 | |
| LIM 10 | |
| . | |
| | OK |
| | 96 N 1443 99999999 33212 WHEREHOO IMAGE/JPG META |
| DATA | |
| | (33212 bytes of data) |
| | 138 NW 230 99999999 74 HTTP TEXT/HTML NONE |
| DATA | |
| | (74 bytes of data are sent) |
| | . |
| BYE | |
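The query exchange above can be driven by a small loop. The sketch below works on any pair of binary file-like objects, so it can be tried against canned bytes before pointing it at a socket. It is a minimal illustration under several assumptions: CRLF line endings, data blocks followed directly by the next header with no separator, and no IDT step (queries do not require one). The name run_query is hypothetical.

```python
import io

def run_query(rfile, wfile, commands, want_data=lambda fields: True):
    """Send query commands, then walk record headers until the server's EOD."""
    def send(line):
        wfile.write(line.encode("ascii") + b"\r\n")   # ASSUMPTION: CRLF
        wfile.flush()

    def read_line():
        raw = rfile.readline()
        if not raw:
            raise ConnectionError("server closed the connection")
        return raw.decode("ascii").strip()

    for command in commands:
        send(command)
    send(".")                            # EOD: the server runs the query now

    status = read_line()
    if status != "OK":                   # e.g. "NAK command_list"
        raise RuntimeError("query rejected: " + status)

    records = []
    while True:
        header = read_line()
        if header == ".":                # no more records
            break
        fields = header.split()
        if want_data(fields):
            send("DATA")
            size = int(fields[4])        # fifth field is the byte count
            records.append((fields, rfile.read(size)))
        else:
            send("SKIP")
            records.append((fields, None))
    send("BYE")
    return records

# Canned server output standing in for a live connection.
server = io.BytesIO(
    b"OK\r\n"
    b"5 N 10 99999999 3 WHEREHOO TEXT/PLAIN NONE\r\nabc"
    b"138 NW 230 99999999 74 HTTP TEXT/HTML NONE\r\n"
    b".\r\n"
)
sent = io.BytesIO()
records = run_query(server, sent, ["ACT QUERY", "LLH 42 -071 0", "RAD 250"],
                    want_data=lambda f: f[5] == "WHEREHOO")
print([data for _, data in records])     # [b'abc', None]
```

With a real connection, the two file objects could come from `sock.makefile("rb")` and `sock.makefile("wb")` on a TCP socket opened to port 5859.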
SAMPLE CLIENT-SERVER INTERACTION - An insert
| From the client to the server | From the server to the client |
| IDT jim | |
| | wherehoo_server 0.56 30 20 1024 65535 |
| ACT INSERT | |
| LLH 72.019 288.908 0 | |
| MIM TEXT/PLAIN | |
| PRO WHEREHOO | |
| DAT 12 | |
| (12 bytes of data) | |
| | ACK |
| MET This is some metadata describing the content of the DAT field | |
| . | |
| | OK |
| | c08b9ac6d59dba9be0106096626809b2e4098f0a |
| | . |
| BYE | |
SAMPLE CLIENT-SERVER INTERACTION - A delete
| From the client to the server | From the server to the client |
| IDT jim | |
| | wherehoo_server 0.56 30 20 1024 65535 |
| ACT DELETE | |
| UID c08b9ac6d59dba9be0106096626809b2e4098f0a | |
| . | |
| | OK |
| | ACK |
| | . |
| BYE | |
Hard limits
| Maximum RAD, WID, or LEN | MAXWIDLEN = 999999 |
| Maximum length of identity for IDT | MAXIDT = 10 |
| Unique id string returned after a successful INSERT action | 40 hex characters (0..F), fixed length |
| Things to think about | |
| Move the distance calculation functions into MySQL (make custom functions) so they can be used in queries. | The querying software must have the distance functions in it... but there are several query frontends (PHP, Perl, Java, whatever) |
| Self-determination of coverage area | How does a network of Wherehoo servers self-configure? |
| Query routing and server acquisition of geodata | |
| Revision History | |
| 0.630 11/1/00 | Removed checking of MIME type presented with the MIM command. This query only returns records that point to web pages accessible via the HTTP protocol |
| 0.623 | Catch datalen > MAXDATA and return DAT error |
| 0.621 10/29/00 | Revised diagnostic byte printouts. Solved the 1429-byte problem: revised the DAT read section, which had a problem using .read(,,), and switched to .ready() plus .read() with a wait loop when anticipating more data. |
| 0.610 10/8/00 | Added some diagnostics. nothing major. |
| 0.602 10/6/00 | Add PJT command to project self to some other place (for periscope, other lookahead apps) |
| 0.590 10/1/00 | Allow BEG, END offsets to expand search window beyond records that are alive "now" |
| 0.580 9/3/00 | Allow META for queries, match meta to any substring in the meta field |
| 0.570 9/17/00 | Add SHA1 signature as 20 bytes appended to DAT bytes. Checks signature by generating SHA1 sig against asserted IDT's secret |