Kalkati.net, XML database dump

 

1 Introduction

The purpose of this document is to act as a technical guide to the Kalkati.net XML representation of Matka.fi timetable data, also known as the timetable dump file. The dump file represents the line and timetable data of the Matka.fi timetable database exported into a single file. The dump file facilitates the interchange of timetable data between Matka.fi and external parties.

Kalkati.net is the name of the XML format used in the dump file. The purpose of the Kalkati.net timetable data transfer format is to unify the presentation of timetable data between different systems. This document describes how the Kalkati.net XML format is applied to Matka.fi timetable data. Its main goal is to give a developer the essential knowledge needed to use the dump file data in building new applications. Note that this guide is not a complete specification or documentation of the dump file but rather a technical guide to it.

2 Getting started

2.1 Links

2.2 Dump file

The dump file represents the line, route, and timetable data of the Matka.fi timetable database exported into a single file. As mentioned in the introduction, the role of the dump file is to enable easy interchange of data between Matka.fi and external transport providers and to enable third party developers to utilize the timetable data.

On a technical level, the dump file is a zipped XML file that contains a variety of data artifacts. The XML of the dump file is based on the standard Kalkati.net schema. The dump file is generated from the database once every week. The generation date and validity period of the dump file are given in the flag.txt file (located in the same directory). Authorized users can download the dump file from the Matka.fi server. The URL of the file is listed in chapter “2.1 - Links”. Downloading the file requires a password, which may be obtained from Liikennevirasto.
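Because the dump is a zipped XML file, it can be read without unpacking it to disk first. The following is a minimal sketch in Python, not an official client: the helper name count_elements and the member name dump.xml are assumptions made for illustration (this guide does not state the name of the XML file inside the archive), and a tiny stand-in archive is built in memory so that the example runs on its own.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def count_elements(zip_source, tag):
    """Count elements named `tag` in the first .xml member of a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        # Assumption: the archive holds one XML member. Its real name is not
        # given in this guide, so the first *.xml entry is used.
        xml_name = next(n for n in archive.namelist() if n.lower().endswith(".xml"))
        with archive.open(xml_name) as stream:
            count = 0
            # iterparse streams the document instead of loading it all at once.
            for _event, elem in ET.iterparse(stream, events=("end",)):
                if elem.tag == tag:
                    count += 1
                    elem.clear()  # drop the content of elements already handled
            return count


# Stand-in archive with a hypothetical member name, built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "dump.xml",
        "<jp_database><Station StationId='1' Name='A'/>"
        "<Station StationId='2' Name='B'/></jp_database>",
    )
buffer.seek(0)
print(count_elements(buffer, "Station"))
```

Streaming with iterparse is chosen here because a full timetable export is likely to be large; for small files, ET.parse would be simpler.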

2.3 Matka.fi

Matka.fi (or Journey.fi in English) is an online travel guide for public transport in Finland. Matka.fi allows users to search for the best public transport connections between two selected locations. The service includes all buses, trams, metro, commuter trains, and ferries. Matka.fi suggests the most suitable connections available at the given time.

www.matka.fi

2.4 Kalkati.net

Kalkati.net is a transport and cargo standards project initiated by the Finnish Ministry of Transport and Communications. It is a set of standards that aims to standardize communication between Finnish public transport operators. Cargo companies also benefit from the Kalkati.net standards. The main goal of the standards is to allow companies to exchange data more easily. This is expected to lead to better collaboration among Finnish transport operators and companies. The standards include a set of process diagrams specifying the communication patterns for exchanging data, and a set of XML Schemas specifying the structure of the exchanged XML documents.

2.5 XML schema

XML Schema is a W3C standard used to specify the data models of XML documents. A schema document specifies the elements and data types that may be used in instance documents (documents that are based on the schema).

Below are some links to help a developer get started with XML Schema. The links include the official XML Schema specification, references, and tutorials.

2.6 Useful tools

XML documents, especially XML Schemas, are often difficult to read as plain text files. For a more convenient view of an XML or XML Schema document, one may use an XML visualization tool. Below are some (freeware) tools for this purpose.

3 Kalkati.net XML File Structure

This chapter serves as a guide to the Kalkati.net XML Schema. It explains the most important elements of the schema and their attributes. Because the Matka.fi dump file is an instance document of the Kalkati.net schema, this chapter is a guide both to the schema and to the structure of the dump file.

We do not intend to give a thorough explanation of all elements and attributes of Kalkati.net XML here, but rather a short overview of them. The exact details about the use of elements and their attributes can always be found in the Kalkati.net XML Schema document. Get the latest dump file and schema from the URLs specified in chapter “2.1 - Links”.

Reading instructions

“Occurrences” states how many times an element must (or may) occur in the document. The possible values are 0..1 (at most once), 1..1 (exactly once), 0..* (any number of times, including zero), and 1..* (at least once). In the “Attributes” section of an element definition, optional attributes are marked with the text “optional”. An attribute without this marking is always present. The “Notes” sections may give special instructions on interpreting the dump file or on dealing with specific issues of Kalkati.net XML in general.

3.1 Overview

Figure 1 presents the structure of the dump file visualized in the NetBeans XML tool. Appendix B presents the same structure visualized in the XMLSpy tool.

 Figure 1: Structure of the dump file

3.2 <jp_database>

<jp_database version='1.0'>
<!-- elements here -->
</jp_database>

Occurrences: 1..1

Description: The root element of the XML document.

Attributes:

3.3 <Delivery>

Occurrences: 1..1

Description: Gives the period of time over which the following timetables are valid and identifies the providing company. Unless otherwise noted, the first date in a footnote vector is the date of the delivery period’s start.

Attributes:

3.4 <Company>

<Company CompanyId='1' Name='YTV' Time='0000' Code='YTV'/>

Occurrences: 1..*

Description: Defines a timetable provider or a transport operator that provides transport services.

Attributes:

Notes: Time should never be used in routing calculations; its role is purely informational. The companies listed in the <Company> elements include not only the main transport operators but also subcontractor operators. Name is always the original Finnish name of the company. <Company> elements may also have synonyms (translations).

3.5 <Country>

<Country CountryId='fi' Name='Suomi' Inland='1'/>
<Country CountryId='se' Name='Ruotsi'/>
<Country CountryId='ru' Name='Venäjä'/>
<Country CountryId='en' Name='Englanti'/>

Description: Defines a country that is involved in the transport service.

Attributes:


Notes: In the dump file, <Country> is always Finland. CountryId values are always in lower case. Name values are always in the default language (Finnish).

3.6 <Timezone>

<Timezone TimezoneId='1'>
<Period Firstday='1970-01-01T00:00:00.0+00:00' Lastday='2100-01-01T00:00:00.0+00:00' Difference='+02:00'/>
</Timezone>

Occurrences: 1..*

Description: Defines a time zone that is in use in the transport service area. The time zone details are defined in the <Period> child elements.

Note: In Matka.fi data, the time zone definition is not used to handle winter/summer time. Matka.fi data is therefore always in the +02:00 time zone.

Attributes:

3.6.1 <Period>

Occurrences: 1..*

Description: Defines the offset from the mean time (UTC+DST) during a period of time.

Attributes:


Notes: In the dump file the time zone is always UTC+DST+02:00. Currently DST is +1 hour between 01:00 (UTC) on the last Sunday of March and 01:00 (UTC) on the last Sunday of October. Note that the periods must never overlap.
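The requirement that periods never overlap can be checked mechanically. The sketch below parses the <Period> elements of one <Timezone> and tests neighbouring periods; the function names are hypothetical, and it assumes that a period beginning exactly at the previous period's Lastday does not count as an overlap (this guide does not say whether Lastday is inclusive).

```python
from datetime import datetime, timedelta
import xml.etree.ElementTree as ET

TIMEZONE_XML = """
<Timezone TimezoneId='1'>
  <Period Firstday='1970-01-01T00:00:00.0+00:00'
          Lastday='2100-01-01T00:00:00.0+00:00' Difference='+02:00'/>
</Timezone>
"""

# Matches values such as 1970-01-01T00:00:00.0+00:00
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_difference(text):
    """Turn a Difference value such as '+02:00' into a timedelta."""
    sign = -1 if text.startswith("-") else 1
    hours, minutes = text.lstrip("+-").split(":")
    return sign * timedelta(hours=int(hours), minutes=int(minutes))


def read_periods(timezone_elem):
    """Return (first, last, offset) tuples sorted by start time."""
    periods = []
    for period in timezone_elem.findall("Period"):
        first = datetime.strptime(period.get("Firstday"), STAMP_FORMAT)
        last = datetime.strptime(period.get("Lastday"), STAMP_FORMAT)
        periods.append((first, last, parse_difference(period.get("Difference"))))
    return sorted(periods, key=lambda p: p[0])


def has_overlap(periods):
    """True if any period starts before the previous one has ended."""
    return any(nxt[0] < prev[1] for prev, nxt in zip(periods, periods[1:]))


periods = read_periods(ET.fromstring(TIMEZONE_XML))
print(has_overlap(periods))
```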

3.7 <Language>

<Language LanguageId='fi' Description='Suomi' isDefault='true'/>
<Language LanguageId='sv' Description='Ruotsi'/>
<Language LanguageId='en' Description='Englanti'/>
<Language LanguageId='ru' Description='Venäjä'/>

Occurrences: 1..*

Description: Defines a language used in the service.

Attributes:


Notes: <Language> elements are analogous to <Country> elements. LanguageId is normally two letters long but may also be three. LanguageId is always written in lower case. Note that the code for the Swedish language is “sv”, not “se”.

3.8 <Station>

<Station StationId='2241206' Name='Mankkaankallio' Minchangetime='0' TimezoneId='1' CountryId='fi' city_id='2' X='2542631.0' Y='6676314.0' type='0'/>

Occurrences: 0..*

Description: Defines a station or stop in the service area.

Attributes:
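As an illustration of reading <Station> elements, the sketch below builds a lookup table keyed by StationId. It is an example under assumptions, not a reference implementation: the function names are hypothetical, the second coordinate attribute is assumed to be named Y, <Station> elements are assumed to be direct children of <jp_database>, and the translated name in the sample is a placeholder. Only direct children are read so that the translated <Station> entries inside <Synonym> are not mistaken for station definitions.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<jp_database>
  <Station StationId='2241206' Name='Mankkaankallio' Minchangetime='0'
           TimezoneId='1' CountryId='fi' city_id='2'
           X='2542631.0' Y='6676314.0' type='0'/>
  <Synonym LanguageId='sv'>
    <Station StationId='2241206' Name='Translated name (placeholder)'/>
  </Synonym>
</jp_database>
"""


def to_float(value):
    """Convert an attribute value to float, keeping None for a missing one."""
    return None if value is None else float(value)


def index_stations(root):
    """Map StationId to a small record built from the station's attributes."""
    stations = {}
    # findall() only looks at direct children, so <Synonym> content is skipped.
    for elem in root.findall("Station"):
        stations[elem.get("StationId")] = {
            "name": elem.get("Name"),
            "x": to_float(elem.get("X")),
            "y": to_float(elem.get("Y")),
            "timezone_id": elem.get("TimezoneId"),
            "country_id": elem.get("CountryId"),
        }
    return stations


stations = index_stations(ET.fromstring(SAMPLE))
print(stations["2241206"]["name"])
```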

3.9 <Trnsattr>

<Trnsattr TrnsattrId='1001' Name='Makuupaikka'/>
<Trnsattr TrnsattrId='2001' Name='Makuupaikka, 1. luokka' AttrType='1001'/>
<Trnsattr TrnsattrId='2002' Name='Makuupaikka, 2. luokka' AttrType='1001'/>

Occurrences: 0..*

Description: Defines a transportation attribute. These elements are used to provide additional (free-form) information about services (<Service> elements).

Attributes:


Notes: <Service> elements refer to the <Trnsattr> elements through the <ServiceAttribute> elements. TrnsattrId attributes are always integers. Name is always in Finnish. The global AttrType codes may be found in a database or in code specifications (such as those of the Trident project).

3.10 <Trnsmode>

<Trnsmode TrnsmodeId='6' Name='Metroliikenne'/>

Occurrences: 0..*

Description: Defines a single transport mode (vehicle type) used in the transport service area.

Attributes:

Notes: Modetype attributes are always integers. Name is always in Finnish.

3.11 <Synonym>

<Synonym LanguageId='sv'>
<Station StationId='6020215' Name='Masaby bibliotek'/>
<Trnsmode TrnsmodeId='1' Name='Buss'/>
<!-- more.. -->
</Synonym>

Occurrences: 1..*

Description: <Synonym> elements provide translations of texts in other elements. A <Synonym> element is only a container; the translations are given as its child elements. The child elements are always of the form <Element id='key' name='translated text'/>, where Element is one of the following: Language, Company, Country, Station, Trnsattr, or Trnsmode. The id attribute refers to the ID of the element that has been translated. The name attribute, which is the actual translation, overrides the value in the original element.

Attributes:

Notes: When a <Language> element is translated, the Description attribute is used instead of the Name attribute, because the <Language> element has no Name attribute.
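The override rule for translations can be sketched as follows. The code collects the default texts of the translatable elements and then lets the children of the matching <Synonym> replace them. The function name, the ID_ATTR table, and the Finnish default names in the sample are assumptions for illustration; the per-element ID attribute names follow the pattern seen in the examples of this chapter.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<jp_database>
  <Language LanguageId='fi' Description='Suomi'/>
  <Station StationId='6020215' Name='Default name (placeholder)'/>
  <Trnsmode TrnsmodeId='1' Name='Default mode (placeholder)'/>
  <Synonym LanguageId='sv'>
    <Station StationId='6020215' Name='Masaby bibliotek'/>
    <Trnsmode TrnsmodeId='1' Name='Buss'/>
  </Synonym>
</jp_database>
"""

# ID attribute of each translatable element type (assumed from the examples).
ID_ATTR = {
    "Language": "LanguageId",
    "Company": "CompanyId",
    "Country": "CountryId",
    "Station": "StationId",
    "Trnsattr": "TrnsattrId",
    "Trnsmode": "TrnsmodeId",
}


def text_attr(tag):
    """<Language> carries its text in Description, the others in Name."""
    return "Description" if tag == "Language" else "Name"


def names_for_language(root, language_id):
    """Return {(element type, id): text} with translations applied."""
    names = {}
    for tag, id_attr in ID_ATTR.items():
        for elem in root.findall(tag):  # default-language definitions
            names[(tag, elem.get(id_attr))] = elem.get(text_attr(tag))
    for synonym in root.findall("Synonym"):
        if synonym.get("LanguageId") != language_id:
            continue
        for child in synonym:  # a translation overrides the default text
            id_attr = ID_ATTR.get(child.tag)
            if id_attr is not None:
                names[(child.tag, child.get(id_attr))] = child.get(text_attr(child.tag))
    return names


root = ET.fromstring(SAMPLE)
print(names_for_language(root, "sv")[("Station", "6020215")])
```

Entries without a translation keep their default text, so a caller can always look up a name regardless of the requested language.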

3.12 <Thrusrvc>

Occurrences: 0..*

Description: DEPRECATED - Enables the definition of new services by combining the existing services (<Service> elements).

Notes: The element is deprecated and is not in use in the dump file. <Change> elements are used now instead of <Thrusrvc> because they are more easily interpreted.

3.13 <Change>

<Change ServiceId1='123' ServiceId2='345' userVisible='true'/>
<Change ServiceId1='432' ServiceId2='455'/>

Occurrences: 0..*

Description: Defines a possible change from the stop of one service to that of another. It merges two services into one so that the last stop of the first service is linked to the first stop of the second service. With <Change>, the changes are always guaranteed.

Attributes:

Notes: When a change is marked as guaranteed in the guaranteed attribute, the passenger should be notified that the second service waits for the first service.

3.14 <Timetbls>

<Timetbls>
  <Service ServiceId='66842'>
  <!-- service contents -->
  </Service>
  <!-- more services -->
</Timetbls>

Occurrences: 0..1

Description: The role of the <Timetbls> element is to act as a container element for the timetables in <Service> elements (see below).

Attributes: none

3.14.1 <Service>

<Service ServiceId='28695'>
  <ServiceNbr CompanyId='3668' ServiceNbr='1024 1' Variant='24' Name='Erottaja - Seurasaari'/>
  <ServiceValidity FootnoteId='10'/>
  <ServiceTrnsmode TrnsmodeId='21'/>
  <ServiceAttribute AttributeId='743' FootnoteId='27'/>
  <Stop Ix='1' StationId='1030130' Arrival='0550'/>
  <Stop Ix='2' StationId='1020174' Arrival='0551'/>
  <Stop Ix='3' StationId='1040128' Arrival='0552'/>
  <Stop . . . />
</Service>

Occurrences: 0..*

Description: Defines information about a single departure (= a service in Kalkati.net). The <Service> elements of <Timetbls> represent the actual (all) timetable data in the system. <Service> elements appear only as child elements of <Timetbls>.

Attributes:

Contained elements:

Notes: In the dump file, the FootnoteId of <ServiceAttribute> always references the same <Footnote> as the FootnoteId in <ServiceValidity>.
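A <Service> element can be turned into a small in-memory record for further processing. The sketch below is one possible shape, not a prescribed data model: the class and function names are hypothetical, and Arrival and Departure are both treated as optional because the examples in this guide show stops carrying only one of them.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

SERVICE_XML = """
<Service ServiceId='28695'>
  <ServiceNbr CompanyId='3668' ServiceNbr='1024 1' Variant='24' Name='Erottaja - Seurasaari'/>
  <ServiceValidity FootnoteId='10'/>
  <ServiceTrnsmode TrnsmodeId='21'/>
  <ServiceAttribute AttributeId='743' FootnoteId='27'/>
  <Stop Ix='1' StationId='1030130' Arrival='0550'/>
  <Stop Ix='2' StationId='1020174' Arrival='0551'/>
  <Stop Ix='3' StationId='1040128' Arrival='0552'/>
</Service>
"""


@dataclass
class Stop:
    ix: int
    station_id: str
    arrival: Optional[str]
    departure: Optional[str]


@dataclass
class Service:
    service_id: str
    name: Optional[str]
    validity_footnote_id: Optional[str]
    stops: List[Stop]


def read_service(elem):
    """Build a Service record from one <Service> element."""
    number = elem.find("ServiceNbr")
    validity = elem.find("ServiceValidity")
    stops = [
        Stop(int(s.get("Ix")), s.get("StationId"), s.get("Arrival"), s.get("Departure"))
        for s in elem.findall("Stop")
    ]
    stops.sort(key=lambda stop: stop.ix)  # Ix gives the order along the route
    return Service(
        service_id=elem.get("ServiceId"),
        name=number.get("Name") if number is not None else None,
        validity_footnote_id=validity.get("FootnoteId") if validity is not None else None,
        stops=stops,
    )


service = read_service(ET.fromstring(SERVICE_XML))
print(service.name, len(service.stops))
```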

3.15 <Footnote>

<Footnote FootnoteId='27' Vector='11111101111110111110011111101111110111111011111' Firstdate='2008-06-02'/>

Occurrences: 1..*

Description: Defines a validity period. Footnotes are used by the <ServiceValidity> and <ServiceAttribute> elements in <Service>. A validity period is specified as a set of days, each marked as either valid or not valid for the timetable.

Attributes:
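To show how a footnote's bit vector can be interpreted, the sketch below decodes the example <Footnote> above into calendar dates. It rests on one assumption that should be checked against the schema: the character at position i of Vector describes the day Firstdate + i days, with '1' meaning valid and '0' meaning not valid. The function names are hypothetical.

```python
from datetime import date, timedelta
import xml.etree.ElementTree as ET

FOOTNOTE_XML = (
    "<Footnote FootnoteId='27' "
    "Vector='11111101111110111110011111101111110111111011111' "
    "Firstdate='2008-06-02'/>"
)


def valid_dates(footnote):
    """List every date whose bit is '1' (assumes one bit per day from Firstdate)."""
    first = date.fromisoformat(footnote.get("Firstdate"))
    vector = footnote.get("Vector")
    return [first + timedelta(days=i) for i, bit in enumerate(vector) if bit == "1"]


def runs_on(footnote, day):
    """True if `day` lies inside the vector and its bit is '1'."""
    first = date.fromisoformat(footnote.get("Firstdate"))
    vector = footnote.get("Vector")
    offset = (day - first).days
    return 0 <= offset < len(vector) and vector[offset] == "1"


footnote = ET.fromstring(FOOTNOTE_XML)
print(runs_on(footnote, date(2008, 6, 2)))  # first day of the vector
print(runs_on(footnote, date(2008, 6, 8)))  # seventh position in the vector
```

Under this assumption, days outside the vector are treated as not valid, which seems the safest default for routing.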

4 Questions and answers

4.1 Keys and references

Q: Can you explain the XML Schema keys and references?

A: Keys are the unique IDs of elements. For example, the ServiceId attribute of the <Service> element is a unique ID (a key) of that element. A reference is a relation between two elements A and B: the key (or ID) of element B is given in the reference attribute of element A. Element A is then said to be in a relation with element B and is linked to its data. For example, the <Stop> element references the <Station> element through its StationId attribute. References in XML Schema are analogous to relations in relational databases.
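Much like a foreign-key check in a relational database, a reference can be verified by looking up the referenced key. The sketch below, with a hypothetical function name and made-up station names, reports every <Stop> whose StationId has no matching <Station> in the same document.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<jp_database>
  <Station StationId='1030130' Name='A'/>
  <Station StationId='1020174' Name='B'/>
  <Timetbls>
    <Service ServiceId='28695'>
      <Stop Ix='1' StationId='1030130' Arrival='0550'/>
      <Stop Ix='2' StationId='1020174' Arrival='0551'/>
      <Stop Ix='3' StationId='1040128' Arrival='0552'/>
    </Service>
  </Timetbls>
</jp_database>
"""


def dangling_station_refs(root):
    """Return (ServiceId, StationId) pairs whose station is not defined."""
    known = {station.get("StationId") for station in root.findall("Station")}
    missing = []
    for service in root.findall("Timetbls/Service"):
        for stop in service.findall("Stop"):
            if stop.get("StationId") not in known:
                missing.append((service.get("ServiceId"), stop.get("StationId")))
    return missing


print(dangling_station_refs(ET.fromstring(SAMPLE)))
```

A validating XML parser would perform this check itself if the schema declares the corresponding key and keyref constraints; the manual version is useful when parsing without validation.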

4.2 Local/Global keys

Q: What is meant by local and global keys?

A: The keys in Kalkati.net XML are divided into three categories based on their scope (namespace): local keys, global keys, and provider-specific keys. The key scopes are defined in the <key> definitions section of the schema. The scope of a key can be seen from the name attribute of its <key> definition. The names follow the pattern [ElementType][Scope]Key. For example, the key CompanyLocalKey states that the CompanyId key is defined in the local scope.

Local scope means that a key is unique only within the current data file. For example, ServiceLocalKey means that ServiceId values are valid only in the document in which they are defined. The IDs have no meaning in any other context.

Global scope means that a key is a commonly recognized code, typically defined in some universal code specification. For example, the keys of <Country> elements come from the ISO specification for country codes.

Provider (specific) scope means that a key is unique only within the data of one timetable data provider. For example, StationProviderKey means that a StationId is a unique ID for a station only within the data of that provider.

4.3 Departure notes

Q: Where are the departure-specific notes (e.g. low-floor bus) specified in the XML data? Where are the short codes (e.g. M for low-floor bus) specified?

A: The special notes are known as transport attributes in Kalkati.net XML. They are specified in the <Trnsattr> elements. A transport attribute is linked to a departure (service) via the AttributeId attribute of the <ServiceAttribute> element (a child element of <Service>).

Unfortunately, Kalkati.net XML has no dedicated attribute for the short codes. However, it is possible to work around this limitation by using the short codes as the TrnsattrId values.

4.4 Special days

Q: How are the departures of special days presented in data?

A: There is no concept of a special day in Kalkati.net XML; all days are treated equally. If a service is not operated on a special day such as Christmas Day, this is indicated in the bit vector string of the service's <Footnote> (see <Footnote> for details). Conversely, a service that is operated only on a special day and with a special timetable is specified in a separate <Service> element of its own.

4.5 Substitute timetables


Q: Is it possible to have substitute timetables for the timetables of weekdays that are actually holidays? For example, to use Sunday timetables on Easter days?

A: Not really. Kalkati.net XML has no concept of a substitute timetable. Special timetables always have to be specified in their own <Service> elements. It is possible, however, to define a single timetable for a line that is valid on all holidays throughout the year. This is done using a <Footnote> that marks the timetable as valid only on the holidays. The bit vector string of the <Footnote> would look something like: 10000000000000000000000100000000000000000010.. (that is, with a lot of zeros for regular days).
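A holiday-only bit vector of this kind could be generated from a list of dates. The sketch below is illustrative only: the function name is hypothetical, the two-week window and the chosen holidays are example inputs rather than values from the dump file, and it assumes one character per day counted from the first date of the vector.

```python
from datetime import date


def build_vector(first, last, valid_days):
    """Return a string with one '1' or '0' per day from `first` to `last`."""
    length = (last - first).days + 1
    valid_offsets = {(day - first).days for day in valid_days}
    return "".join("1" if i in valid_offsets else "0" for i in range(length))


# Example input: a two-week window containing three public holidays.
holidays = [date(2008, 12, 25), date(2008, 12, 26), date(2009, 1, 1)]
vector = build_vector(date(2008, 12, 22), date(2009, 1, 4), holidays)
print(vector)
```

Dates that fall outside the window are simply ignored, because their offsets never match a position in the vector.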

4.6 Element Order in the XML File

Q: Are the elements always in the same order in a Kalkati.net XML file?

A: Yes. This is because the element groups in the Kalkati.net XML Schema are always declared as sequences. The sequence indicator in XML Schema means that the elements must appear in the same order in which they are declared in the sequence. In general, parsing sequenced elements is often much faster than parsing unordered elements.

Note that this also applies to the <Synonym> element. Its child elements must be grouped by element type so that the groups appear in the same order as in the rest of the file. Within a group, however, individual elements can be in any order.

4.7 Over-midnight Departures

Q: How are the over-midnight departures presented in Kalkati.net XML?

A: In Kalkati.net XML the departure times in <Stop> elements are specified using a 32-hour clock. This means that a bus may depart from a stop at, e.g., 24:15 (= 0:15 am on the following day). In terms of Kalkati.net XML, therefore, a departure (service) never runs over into the next day; its times simply continue past 24:00. Below is an example of the representation of an over-midnight departure (service).

<Service ServiceId="27">
    <ServiceNbr ServiceNbr="540" Name="Espoo - Ikean liittymä – Helsinki- Vantaan lentoasema" CompanyId="2" />
    <ServiceValidity FootnoteId="4" />
    <ServiceTrnsmode TrnsmodeId="1" />
    <ServiceAttribute FootnoteId="4" AttributeId="DT.s"/>
    <ServiceAttribute FootnoteId="4" AttributeId="NT.27"/>
    <Stop Ix="1" StationId="2987743" Departure="2350"/>
    <Stop Ix="2" StationId="2345553" Departure="2352"/>
    <Stop Ix="3" StationId="2837456" Departure="2400"/>
    <Stop Ix="4" StationId="2223048" Departure="2403"/>
    <Stop Ix="5" StationId="2304894" Departure="2405"/>
    <Stop Ix="6" StationId="2008737" Departure="2408"/>
    <Stop Ix="7" StationId="1439457" Departure="2410"/>
    <Stop Ix="8" StationId="1483633" Departure="2413"/>
    <Stop Ix="9" StationId="1098744" Departure="2417"/>
    <Stop Ix="10" StationId="3943774" Departure="2421"/>
    <Stop Ix="11" StationId="3303008" Departure="2424"/>
    <Stop Ix="12" StationId="3098573" Departure="2425"/>
</Service>
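Times on the extended clock cannot be parsed as ordinary clock times, since hours such as 24 are outside the normal range. A simple approach, sketched below with hypothetical function names, is to treat each value as an offset from midnight of the operating day. The sketch assumes that times are always written as four digits (HHMM), as in the example above.

```python
from datetime import date, datetime, time, timedelta


def parse_clock(hhmm):
    """Convert an HHMM string such as '2403' to an offset from midnight."""
    return timedelta(hours=int(hhmm[:2]), minutes=int(hhmm[2:]))


def actual_datetime(operating_day, hhmm):
    """Resolve a stop time to a real date and time of day."""
    midnight = datetime.combine(operating_day, time())
    return midnight + parse_clock(hhmm)


# A stop time of 2403 on a service that operates on 2 June 2008
print(actual_datetime(date(2008, 6, 2), "2403"))  # 2008-06-03 00:03:00
```

Keeping the operating day and the offset separate means that a footnote lookup still uses the day on which the service started, even for stops served after midnight.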

Appendix B

The following table presents the “Trident project codes” for the different vehicle types (transport modes).

 

Code  Description
1     air
2     train
21    long/mid distance train
22    local train
23    rapid transit
3     metro
4     tramway
5     bus, coach
6     ferry
7     waterborne
8     private vehicle
9     walk
10    other