Uses and Users of Confidential Commodity Flow Survey Data in the United States
Young-Jun Kweon, WenWei Zeng, Mike Carter, Ryan Grube, Cha-Chi Fan

The Commodity Flow Survey (CFS), a shipper-based survey in the U.S., produces public and confidential data about goods movement. This study examines the characteristics of the uses and users of confidential CFS data by analyzing two datasets: 44 proposals requesting confidential CFS data and the metadata of 849 proposals requesting any confidential data (CFS and non-CFS). The metadata was analyzed to compare proposals requesting CFS data with those not requesting it. Both qualitative (human review) and quantitative (machine review using the latent Dirichlet allocation topic model) approaches were applied. Additionally, Transport Research International Documentation (TRID) records were reviewed to compare research using public versus confidential CFS data. The human review of the 44 proposals summarized their submitters, research areas, requested variables, and feedback. The results show that CFS data are highly valued, with an emphasis on economic topics, strong interest in additional geographic detail, and a preference for shipment value over shipment weight. Term frequency analysis showed that “supply chain” was the most frequent two-word term in the 44 proposals, while “freight,” the most common term in TRID records, did not rank among the top 25 one-word terms. The topic models confirmed the findings of the qualitative review, highlighting business and economic themes as central among the 44 proposals. Analysis of the 849 metadata entries revealed that the CFS-related topic differed substantially from the others, underscoring the dataset’s unique and irreplaceable nature. The qualitative and quantitative approaches complement each other, offering a comprehensive understanding of the research value of the CFS. Additional findings are discussed in the paper.