Skip to contents

The Mobility Database catalogs several types of transit and mobility feeds. While GTFS Schedule feeds (formerly GTFS Static) make up the majority of entries, they represent only part of the mobility data landscape. Knowing when a bus is scheduled to arrive is useful, but knowing whether it will actually arrive on time is even more valuable.

The Database also contains GTFS Realtime and GBFS feeds, which provide real-time operational data about transit systems and shared mobility services.

As mobdb consumes data from the Mobility Database API, a number of the functions in this package will provide valid outputs of GTFS-Realtime and GBFS objects. However, it’s important to note that support for GTFS-Realtime and GBFS in mobdb should be considered experimental at this time and is limited only to feed discovery.

GTFS-Realtime data structure

GTFS-Realtime, as of the time of writing, supports four distinct feed entities:

  • TripUpdates - fluctuations in the timetable - “Bus X is delayed 2 minutes”
  • ServiceAlerts - problems/issues with an entity - “Stop Y is closed”
  • VehiclePositions - vehicle location and sometimes speed - “This bus is at position X at time Y”
  • TripModifications - detours that affect a set of trips - “On X weekend, XYZ trips are on detour”

A detailed overview of what each feed entity does is available on the official GTFS reference.

GTFS-Realtime data is encoded and decoded as Protocol Buffers. In R, Protocol Buffers can be interpreted using the RProtoBuf package.

How mobdb handles GTFS-Realtime

mobdb, whether through feeds() or mobdb_search() will surface all available GTFS-Realtime feeds on the Mobility Database. The URL retrieved from mobdb_feed_url() will be the protobuf endpoint url.

Note: Many GTFS-Realtime producers will require authentication to access their endpoints. Use the source_info data frame to get authentication information from the Mobility Database.

# TransLink Vancouver GTFS-RT
gtfs_rt_yvr <- feeds(provider = "TransLink Vancouver", data_type = "gtfs_rt")

# read URL for GTFS-RT vehicle position
mobdb_feed_url(gtfs_rt_yvr$id[1])

# TransLink, like many agencies, requires authentication for their RT API
# Get authentication information from Mobility Database
yvr_rt_auth <- gtfs_rt_yvr$source_info[1, ]

# Get column names
names(yvr_rt_auth)

# Get registration URL
yvr_rt_auth$authentication_info_url

# Get API key parameter
yvr_rt_auth$api_key_parameter_name

Parsing GTFS-Realtime with RProtoBuf

Once you have the URL, you can use RProtoBuf to decode the Protocol Buffer data:

library(RProtoBuf)

gtfs_rt_example <- feeds(provider = "sample-agency", data_type = "gtfs_rt")

# read URL for GTFS-RT vehicle position
example_url <- mobdb_feed_url(gtfs_rt_example$id[1])

# Download and read the GTFS-RT .proto definition
# Available at: https://gtfs.org/documentation/realtime/gtfs-realtime.proto
proto_url <- "https://raw.githubusercontent.com/google/transit/master/gtfs-realtime/proto/gtfs-realtime.proto"
proto_file <- tempfile(fileext = ".proto")
download.file(proto_url, proto_file, quiet = TRUE)
readProtoFiles(proto_file)

# Fetch and parse the feed
con <- url(gtfs_rt_example, "rb")
feed <- read(transit_realtime.FeedMessage, con)
close(con)

# Access vehicle positions
for (entity in feed$entity) {
  if (entity$has("vehicle")) {
    vehicle <- entity$vehicle
    cat("Vehicle ID:", vehicle$vehicle$id, "\n")
    cat("Position:", vehicle$position$latitude, ",", vehicle$position$longitude, "\n")
    cat("Timestamp:", vehicle$timestamp, "\n\n")
  }
}

Note: The GTFS-Realtime Protocol Buffer definition is maintained in the Google Transit GitHub repository. The example above downloads it automatically, but you can also save it locally for repeated use.

GBFS data structure

GBFS, or the General Bikeshare Feed Specification, is a “real-time, pull-based, data specification that describes the current status of a mobility system”. In other words, GBFS defines the status of shared mobility systems like bikeshare or scooter share systems.

Unlike GTFS-Schedule or GTFS-Realtime, it is structured as a series of JSON files.

As of the latest version (v3), two files are required of any GBFS feed:

  • gbfs.json - an auto-discovery file that core information about a shared mobility feed and shares what other files are available
  • system_information.json - defines core information about the shared mobility system

Other feed types are conditionally required depending on the system type (e.g. dock vs dockless), while others are fully optional.

A full list of the GBFS feed entities and their requirements are available on the official GBFS reference documentation or on the GitHub repo.

How mobdb handles GBFS

mobdb, whether through feeds() or mobdb_search() will surface all available GBFS feeds on the Mobility Database. The URL retrieved from mobdb_feed_url() will be the auto-discovery endpoint (i.e. gbfs.json).

# Search GBFS feeds in Vancouver
gbfs_yvr <- mobdb_search("vancouver", data_type = "gbfs")

yvr_feed <- mobdb_feed_url(gbfs_yvr$id[1])

yvr_feed

This can then be passed on to jsonlite for parsing or to the dedicated gbfs package for discovery.

library(gbfs)

gbfs_yvr <- mobdb_search("vancouver", data_type = "gbfs")
yvr_feed <- mobdb_feed_url(gbfs_yvr$id[1])
yvr_station_info <- get_station_information(yvr_feed, output = "return")

Next Steps

This vignette covered the basics of discovering GTFS-Realtime and GBFS feeds using mobdb. For working with the actual feed data:

  • GTFS-Realtime: Use the RProtoBuf package to decode Protocol Buffer data
  • GBFS: Use the gbfs package for streamlined access to bikeshare data, or jsonlite for direct JSON parsing

For discovering GTFS Schedule feeds and working with historical transit data, see the main package documentation and other vignettes.