
Querying for pedestrian-related collisions

Collision data is inherently complicated since it attempts to aggregate all the elements of a collision into a nice, standardized database. The breadth of information available contributes to the complexity and can make it quite daunting for someone to understand without extensive experience working with the data. I wanted to address some of the more common questions that I have heard over the years and hopefully provide sufficient explanations in a series of blog posts.

Given the interest in the built environment and the focus on improving conditions for pedestrians, a common data question is: How many pedestrian collisions were there? It sounds like a simple question, but California’s SWITRS collision data can give several different answers. To answer it, you need to understand what you are asking, what you are actually querying from the data, and the limitations of the data. There are five pedestrian-related fields of interest in SWITRS, and they can generate different results. The following table shows queries made for local road collisions in San Francisco in 2011 (one of the sample datasets available for the RoadSafe GIS demo service):


Field                          Values     Collision Count
Collision Type                 [image]    734
Motor Vehicle Involved With    [image]    806
Pedestrian Involvement         [image]    848
Party Type (Party Table)       [image]    848
Victim Role (Victim Table)     [image]    810

Why such different numbers? Let’s break it down one field at a time. The California Collision Investigation Manual states that the type of collision should be determined by “the first injury or damage-causing event.” The officer must therefore choose a single collision type, and that type is not necessarily vehicle/pedestrian even if a pedestrian was involved. For example, if a car rear-ended another car that subsequently struck a pedestrian, it would be classified as a rear-end collision since that was the first event. Therefore, only the 734 collisions where the pedestrian was struck in the initial movement of the collision are classified as vehicle/pedestrian.
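To make the rear-end example concrete, here is a minimal sketch on three made-up collision records. The field names and the codes ("C" for rear end, "G" for vehicle/pedestrian) are assumptions for illustration, so check them against the SWITRS data dictionary before reusing them on real data.

```python
# Toy collision records, not real SWITRS rows.
# Field names and code values are assumptions for illustration only.
collisions = [
    {"case_id": 1, "type_of_collision": "G", "ped_involved": True},   # pedestrian struck first
    {"case_id": 2, "type_of_collision": "C", "ped_involved": True},   # rear end, then pedestrian struck
    {"case_id": 3, "type_of_collision": "C", "ped_involved": False},  # rear end, no pedestrian
]

# Query 1: collisions whose first event was vehicle/pedestrian.
by_type = [c["case_id"] for c in collisions if c["type_of_collision"] == "G"]

# Query 2: collisions with any pedestrian involved.
by_involvement = [c["case_id"] for c in collisions if c["ped_involved"]]

# Collisions that the collision-type query misses.
missed = sorted(set(by_involvement) - set(by_type))

print(len(by_type), len(by_involvement), missed)
```

Case 2 is the one the collision-type query misses: a pedestrian was involved, but the first event was the rear-end impact.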

The second field, Motor Vehicle Involved With, is probably the most complicated of the group since there are several exceptions to the guidelines along with other potential inconsistencies. For pedestrian collisions, the field is generally supposed to match the type of collision. However, in many cases where the collision type is not vehicle/pedestrian, Motor Vehicle Involved With is still marked as pedestrian. As a result, this field returns more pedestrian collisions (806) than Collision Type does (734). Without delving into more details, this field is probably not the best to use in this case. Feel free to review the manual on your own for this one!

What about the 848 collisions marked as pedestrian involved? It means that 114 (848 minus 734) collisions involved a pedestrian struck in a secondary movement. This 848 is probably the best answer to our original question. You can also see that querying by the party type results in 848 collisions as well. That is because the pedestrian involved collision field is actually derived from the party types.
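The point that pedestrian involvement is derived from the party types can be sketched with a small in-memory SQLite table. This is only an illustration under stated assumptions: the table name, column names, and the plain-text party type values are hypothetical stand-ins, not the actual SWITRS schema or codes.

```python
import sqlite3

# Hypothetical party table: one row per party, several parties per collision.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE party (case_id INTEGER, party_number INTEGER, party_type TEXT);
INSERT INTO party VALUES
  (1, 1, 'driver'), (1, 2, 'pedestrian'), (1, 3, 'pedestrian'),
  (2, 1, 'driver'), (2, 2, 'driver'),     (2, 3, 'pedestrian'),
  (3, 1, 'driver'), (3, 2, 'driver');
""")

# Counting party rows counts pedestrians, not collisions.
ped_party_rows = conn.execute(
    "SELECT COUNT(*) FROM party WHERE party_type = 'pedestrian'"
).fetchone()[0]

# Counting distinct case IDs counts collisions with at least one pedestrian,
# which is what a derived "pedestrian involved" flag represents.
ped_collisions = conn.execute(
    "SELECT COUNT(DISTINCT case_id) FROM party WHERE party_type = 'pedestrian'"
).fetchone()[0]

print(ped_party_rows, ped_collisions)
```

In this toy data there are three pedestrian parties but only two pedestrian collisions, so a party-type query only matches the pedestrian involvement count when it counts distinct collisions rather than party rows.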

Finally, what about the 810 collisions when a pedestrian victim is queried from the data? The victim field summarizes the persons that were injured in a collision. 810 of the 848 collisions involving a pedestrian resulted in an injury to the pedestrian. In the other 38 cases, the pedestrian was fortunate to avoid an injury. These collisions are also classified as property damage only (PDO). (Note that querying by victim can also cause confusion in the context of collisions vs. injuries. We will address this exact topic in a future blog post.)
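As a rough sketch of why the victim query comes in lower, the toy example below compares collisions that have a pedestrian party with collisions that have a pedestrian victim. The tuples and role labels are made up for illustration and are not the real SWITRS layout or codes.

```python
# Hypothetical rows: (case_id, party_type) and (case_id, victim_role).
party_rows = [
    (1, "pedestrian"), (1, "pedestrian"), (1, "driver"),  # two pedestrians, both injured
    (2, "pedestrian"), (2, "driver"),                     # pedestrian present but not injured
    (3, "driver"), (3, "driver"),                         # no pedestrian
]
victim_rows = [
    (1, "pedestrian"), (1, "pedestrian"),  # one victim row per injured pedestrian
    (3, "driver"),
]

# Collisions with at least one pedestrian party.
ped_party_cases = {case for case, ptype in party_rows if ptype == "pedestrian"}

# Collisions with at least one injured pedestrian.
ped_victim_cases = {case for case, role in victim_rows if role == "pedestrian"}

# Pedestrian collisions where no pedestrian appears as a victim.
uninjured_cases = sorted(ped_party_cases - ped_victim_cases)

# Number of pedestrian victim rows, which is not the same as collisions.
ped_victim_rows = sum(1 for _, role in victim_rows if role == "pedestrian")

print(len(ped_party_cases), len(ped_victim_cases), uninjured_cases, ped_victim_rows)
```

Here the party query finds two pedestrian collisions, the victim query finds one, and the victim table holds two pedestrian rows for that single collision, which is the collisions vs. injuries distinction mentioned above.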

So the question again: How many pedestrian collisions were there? 848. It is easiest and most accurate to look at the pedestrian involvement field. As you work more with the data, you may also discover that the numbers do not always add up and results may be off by one or two collisions. Keep in mind that the data is never perfect, so the occasional inconsistency will exist in the database. There was actually one in this analysis, but I ignored it to avoid any confusion!
