Waymo

The dataset includes:

  • 103,354 segments, each 20 s long and sampled at 10 Hz (over 20 million frames), mined for interesting interactions

  • 574 hours of data

  • Sensor data
    • 4 short-range lidars

    • 1 mid-range lidar

  • Object data
    • 10.8M objects with tracking IDs

    • Labels for 3 object classes - Vehicles, Pedestrians, Cyclists

    • 3D bounding boxes for each object

    • Mined for interesting behaviors and scenarios for behavior prediction research, such as unprotected turns, merges, lane changes, and intersections

    • 3D bounding boxes are generated by a model trained on the Perception Dataset and detailed in our paper: Offboard 3D Object Detection from Point Cloud Sequences

  • Map data
    • 3D map data for each segment

    • Locations include: San Francisco, Phoenix, Mountain View, Los Angeles, Detroit, and Seattle

    • Added entrances to driveways (the map already includes lane centers, lane boundaries, road boundaries, crosswalks, speed bumps, and stop signs)

    • Adjusted some road edge boundary height estimates
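
The segment count, frame count, and total duration listed above are mutually consistent, which a quick calculation can confirm. The sketch below assumes every segment is exactly 20 s long and sampled at exactly 10 Hz (200 frames per segment); the constant names are illustrative only.

```python
# Sanity check of the dataset statistics quoted above.
# Assumption: each segment is exactly 20 s at 10 Hz, i.e. 200 frames.
NUM_SEGMENTS = 103_354
SEGMENT_SECONDS = 20
FRAME_RATE_HZ = 10

total_frames = NUM_SEGMENTS * SEGMENT_SECONDS * FRAME_RATE_HZ
total_hours = NUM_SEGMENTS * SEGMENT_SECONDS / 3600

print(f"frames: {total_frames:,}")   # frames: 20,670,800
print(f"hours:  {total_hours:.1f}")  # hours:  574.2
```

Under that assumption, the result agrees with "over 20 million frames" and "574 hours of data".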

1. Install Requirements

First, install TensorFlow and Protobuf:

pip install tensorflow==2.11.0
conda install protobuf==3.20

Note

Installing Protobuf with pip install protobuf==3.20 may fail. If it does, install it with conda install protobuf=3.20 instead.
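
To confirm that the environment matches the versions pinned above, a small check like the following may help. It is only a sketch: the helper names installed_version and matches are hypothetical, and it assumes that the installer (pip or conda) registered standard package metadata that importlib.metadata can read.

```python
from importlib import metadata

# Versions taken from the install commands above.
EXPECTED = {"tensorflow": "2.11.0", "protobuf": "3.20"}

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def matches(installed, expected):
    """True if `installed` equals `expected` or extends it (3.20.3 matches 3.20)."""
    return installed is not None and (
        installed == expected or installed.startswith(expected + ".")
    )

for package, expected in EXPECTED.items():
    found = installed_version(package)
    status = "OK" if matches(found, expected) else "MISMATCH"
    print(f"{package}: expected {expected}, found {found} [{status}]")
```

A MISMATCH line only means the installed version differs from the one pinned here; whether other versions also work is not stated in this guide.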

2. Download TFRecord

The Waymo motion dataset is hosted on Google Cloud. Downloading it requires gsutil; the installation tutorial is at https://cloud.google.com/storage/docs/gsutil_install.

Log in to your Google account via:

gcloud init

After this, you can access all the data and download it to the current directory ./ with the following command (don’t forget the trailing dot!):

gsutil -m cp -r "gs://waymo_open_dataset_motion_v_1_2_0/uncompressed/scenario" .

Alternatively, download only part of the dataset with a command like:

gsutil -m cp -r "gs://waymo_open_dataset_motion_v_1_2_0/uncompressed/scenario/training_20s" .

The downloaded data should be stored in a directory like this:

waymo
├── training_20s/
|   ├── training_20s.tfrecord-00000-of-01000
|   ├── training_20s.tfrecord-00001-of-01000
|   └── ...
├── validation/
|   ├── validation.tfrecord-00000-of-00150
|   ├── validation.tfrecord-00001-of-00150
|   └── ...
└── testing/
    ├── testing.tfrecord-00000-of-00150
    ├── testing.tfrecord-00001-of-00150
    └── ...
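
Before converting, it can be useful to check that the download produced the layout shown above. The following is a minimal sketch, assuming the data sits under ./waymo and that shard files follow the <split>.tfrecord-* naming from the tree; count_shards is a hypothetical helper, not part of any library.

```python
from pathlib import Path

# Split names taken from the directory tree above.
SPLITS = ("training_20s", "validation", "testing")

def count_shards(root):
    """Map each split name to the number of <split>.tfrecord-* files under root."""
    root = Path(root)
    return {
        split: len(list((root / split).glob(f"{split}.tfrecord-*")))
        for split in SPLITS
    }

if __name__ == "__main__":
    # Assumed download location; adjust to wherever the data was copied.
    for split, n in count_shards("./waymo").items():
        print(f"{split}: {n} shard(s)")
```

A count of 0 for a split suggests it was not downloaded or was placed in a different directory.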

3. Build Waymo Database

Run the following command to extract scenarios from any directory containing TFRecord files.

The example below converts the raw data in training_20s:

python -m scenarionet.convert_waymo -d /path/to/your/database --raw_data_path ./waymo/training_20s --num_workers 64

All converted scenarios are now stored at /path/to/your/database and are ready to be used in your work.
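
If several splits need converting, the same command can be assembled programmatically. The sketch below only builds and prints the commands, using the -d, --raw_data_path, and --num_workers flags shown above. Writing each split to its own subdirectory of the database path is an assumption made for illustration, as are the helper name build_convert_command and the default of 8 workers.

```python
import shlex
import sys

def build_convert_command(database_path, raw_data_path, num_workers=8):
    """Return the argv list for one scenarionet.convert_waymo run."""
    return [
        sys.executable, "-m", "scenarionet.convert_waymo",
        "-d", str(database_path),
        "--raw_data_path", str(raw_data_path),
        "--num_workers", str(num_workers),
    ]

# Hypothetical layout: one output database per split.
SPLITS = ("training_20s", "validation", "testing")
commands = [
    build_convert_command(f"/path/to/your/database/{split}", f"./waymo/{split}")
    for split in SPLITS
]

for cmd in commands:
    # Print a copy-pasteable shell command.
    print(shlex.join(cmd))
    # To actually run the conversion: subprocess.run(cmd, check=True)
```

Printing first makes it easy to inspect the paths before starting a long conversion.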

Note

When running the conversion, please double-check that the GPU is not being used. This converter should NOT use the GPU. We have disabled GPU usage by setting os.environ["CUDA_VISIBLE_DEVICES"] = "".
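
The effect of that environment variable can be checked independently of the converter. This is a minimal sketch, not the converter's own code: count_visible_gpus is a hypothetical helper, and it assumes the variable is set before TensorFlow is first imported in the process.

```python
import os

# Must run before TensorFlow is imported; an empty value hides all CUDA devices.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

def count_visible_gpus():
    """Return how many GPUs TensorFlow can see (imports TensorFlow lazily)."""
    import tensorflow as tf
    return len(tf.config.list_physical_devices("GPU"))
```

With the variable set as above, count_visible_gpus() is expected to return 0 on a machine with NVIDIA GPUs.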

Known Issues: Waymo

N/A