#usda #fooddata-central #foundation-foods #branded-foods #fndds

USDA FDC: Foundation vs SR Legacy vs Survey vs Branded — which to query when

Five sub-databases under one API. Picking the right one is most of the battle.

By Alex Brennan · Published September 22, 2025

What’s in each

The five sub-databases:

Sub-database	Items	Source	Update cadence
Foundation	~250	New USDA analytical work	Quarterly
SR Legacy	~7,800	USDA SR 28 (frozen 2018)	None (frozen)
Survey/FNDDS	~7,000	NHANES “what we ate”	~2-yearly
Branded	~1.4M+	Manufacturer-submitted label panels	Continuous
Experimental	<100	Active USDA research	Irregular

A query like “banana” hits all five if unfiltered. The right answer depends on what you’re trying to do.

Foundation Foods

The shiny new tier. USDA reanalyses a small number of foods with current analytical methods and publishes complete macronutrient panels alongside extensive micronutrient panels. Coverage is small (~250 items as of late 2025) but quality is the highest in the whole FDC.

Use when: you want the most accurate possible single answer for a common raw food. “What’s a banana actually have in it?” Foundation Foods is the best answer if it’s there.

Don’t use for: branded products, prepared dishes, or anything outside its roughly 250-item coverage.

SR Legacy (Standard Reference 28)

The historic backbone. Frozen in 2018; these records will not change. ~7,800 entries covering most raw and lightly-processed foods, plus a substantial selection of common cooked dishes.

Use when: Foundation doesn’t have it. SR Legacy is “the rest of the iceberg” for raw foods.

Don’t use for: anything packaged with a brand name, or anything that has been reanalysed since the 2018 freeze.

Caveat: the numbers are accurate for the items as analysed in 2018 and earlier. Modern broiler chickens have different fat profiles than the 2018 records. The discrepancy is usually <5% but it exists.

Survey (FNDDS)

The “what people eat” database. Items here look like “Yogurt, fruit, lowfat, with low-calorie sweetener, prepared” — phrasings designed to match how respondents actually describe food in 24-hour recall surveys.

Use when: you want a typical-prepared-version, e.g. “scrambled eggs as commonly prepared” rather than “raw egg.” The Survey records are weighted by typical preparation methods.

Don’t use for: ingredient-level analysis, brand-specific lookups.

The Survey records have FNDDS food codes (e.g. 13280300) which can also be useful as stable cross-references to NHANES data if you’re doing research-grade analysis.
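As a rough illustration of using those codes as cross-reference keys, the sketch below indexes Survey hits by food code. It assumes each hit is a dict that carries the code under a `foodCode` key; check that against the responses you actually get before relying on it. The function name is made up for this example.

```python
def index_by_food_code(survey_hits: list[dict]) -> dict[str, dict]:
    """Map FNDDS food code -> Survey record, skipping hits without a code.

    Assumption: the code lives under the "foodCode" key of each hit.
    """
    index: dict[str, dict] = {}
    for hit in survey_hits:
        code = hit.get("foodCode")
        if code is not None:
            # Normalise to str so 13280300 and "13280300" land on the same key
            index[str(code)] = hit
    return index
```

Keying on the code rather than the description means a reworded description in a later FNDDS release does not break the join to NHANES intake data.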

Branded Foods

The big one — over 1.4 million products with manufacturer-submitted nutrition facts. This is what powers most “barcode lookup” features.

Use when: you have a specific product (with a GTIN/barcode) and want manufacturer-stated label data.

Don’t use for: “what’s the average banana have in it” — Branded will drown it in branded banana products with various preparations.

Caveat: Branded data is whatever the manufacturer submitted. Errors and inconsistencies exist. We have seen cereal records with 0g sugar when the label clearly says 12g. Caveat lector.
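One cheap way to flag some of the worst Branded records is a plausibility check: compare the stated calories against a 4/4/9 kcal-per-gram estimate from protein, carbohydrate, and fat. This is only a sketch. The function name and the 25% tolerance are arbitrary choices for illustration, and fibre and sugar alcohols make the estimate loose. It also would not catch the sugar example above, since that needs a comparison against the physical label.

```python
def looks_inconsistent(kcal: float, protein_g: float, carb_g: float,
                       fat_g: float, tolerance: float = 0.25) -> bool:
    """True if stated kcal and the 4/4/9 estimate disagree by more than
    `tolerance`, measured as a fraction of the larger of the two values."""
    estimated = 4 * protein_g + 4 * carb_g + 9 * fat_g
    largest = max(kcal, estimated)
    if largest == 0:
        # Nothing stated and nothing estimated (e.g. water): not suspicious
        return False
    return abs(kcal - estimated) / largest > tolerance
```

A record that fails this check is not necessarily wrong, but it is a reasonable candidate for hiding behind a "verify against the label" prompt.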

Experimental Foods

USDA’s bench. Small. Mostly relevant for researchers who care about specific cultivars or experimental crops. Not useful for tracker apps unless you’re specifically working with a researcher.
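The "use when" rules above can be condensed into a small lookup. This is a sketch, not an official mapping: the intent labels are invented for this example, and the data type strings are the ones used in the API examples in this article.

```python
# Hypothetical intent labels -> FDC data types, most preferred first
DATA_TYPES_BY_INTENT: dict[str, list[str]] = {
    "raw_ingredient": ["Foundation", "SR Legacy"],  # best analysis first
    "prepared_dish": ["Survey (FNDDS)"],            # as-eaten forms
    "barcode": ["Branded"],                         # label data by GTIN
    "research": ["Experimental"],                   # cultivar-level work
}


def data_types_for(intent: str) -> list[str]:
    """Return the data types to query for an intent, in priority order."""
    try:
        return DATA_TYPES_BY_INTENT[intent]
    except KeyError:
        raise ValueError(f"unknown intent: {intent!r}") from None
```

A tracker would typically pick the intent from context: a barcode scan maps to "barcode", a free-text ingredient search to "raw_ingredient".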

Search strategy

For a calorie tracker building a search-as-you-type:

def search_priority(query: str) -> list[dict]:
    """
    Returns FDC search results biased toward analytical accuracy
    while preserving the option to find branded products.

    Assumes fdc_search(query, dataType=..., pageSize=...) is a thin wrapper
    around the FDC search endpoint that returns a list of food dicts.
    """
    # 1. First search Foundation + SR Legacy for analytical hits, capped so
    #    they cannot fill the whole 25-item result list on their own
    analytical = fdc_search(query, dataType="Foundation,SR Legacy", pageSize=10)

    # 2. Then search Branded for branded hits, capped
    branded = fdc_search(query, dataType="Branded", pageSize=10)

    # 3. Survey for prepared-form hits, also capped
    survey = fdc_search(query, dataType="Survey (FNDDS)", pageSize=5)

    # Merge in priority order (analytical, then survey, then branded) and
    # dedup by lower-cased description, keeping the first occurrence
    seen = set()
    out = []
    for f in analytical + survey + branded:
        desc = f["description"].lower()
        if desc in seen:
            continue
        seen.add(desc)
        out.append(f)

    return out[:25]

Judging by its behaviour, this is roughly what Cronometer’s search does internally. Waistline does something similar, but queries Open Food Facts (OFF) first and uses FDC as the fallback.

Filtering by data type

In the API:

# only foundation
curl "...&query=banana&dataType=Foundation"

# foundation + sr legacy
curl "...&query=banana&dataType=Foundation,SR%20Legacy"

# all but branded
curl "...&query=banana&dataType=Foundation,SR%20Legacy,Survey%20(FNDDS),Experimental"

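Note that Survey (FNDDS) contains a space and parentheses. The space must be URL-encoded as %20. Many clients pass the parentheses through unencoded, as in the example above, but encoding them as %28 and %29 is the safe choice.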
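If you build the query string in code rather than by hand, the standard library can do the encoding for you. The sketch below uses only `urllib.parse`; the function name is invented for this example, and the parameter names (`query`, `dataType`, `pageSize`) are the ones already used in this article.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def search_query_string(query: str, data_types: list[str],
                        page_size: Optional[int] = None) -> str:
    """Build the query-string part of a search URL (no host, no API key)."""
    params: dict[str, object] = {
        "query": query,
        "dataType": ",".join(data_types),
    }
    if page_size is not None:
        params["pageSize"] = page_size
    # quote_via=quote encodes spaces as %20 (not "+"); safe="," keeps the
    # comma separators readable; parentheses become %28 and %29.
    return urlencode(params, quote_via=quote, safe=",")
```

For example, `search_query_string("banana", ["Foundation", "SR Legacy"])` yields `query=banana&dataType=Foundation,SR%20Legacy`, matching the hand-written curl example.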

When the tracker should ask the user

For queries where several sub-databases return a hit, the right user experience is to disambiguate. “Banana, raw” (Foundation), “Banana, fresh, organic” (a Branded entry from Whole Foods Market), and “Banana smoothie, prepared” (Survey/FNDDS) are three different foods.

Cronometer surfaces these distinctly. Most FOSS apps just take the first hit, which is usually the right call but occasionally embarrassing.
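A minimal sketch of the disambiguation step: group the merged hits by sub-database and only prompt the user when more than one group is present. It assumes each hit is a dict with a `dataType` key, and both function names are invented for this example.

```python
from collections import defaultdict


def group_by_data_type(hits: list[dict]) -> dict[str, list[dict]]:
    """Bucket search hits by the sub-database they came from."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for hit in hits:
        groups[hit.get("dataType", "Unknown")].append(hit)
    return dict(groups)


def needs_disambiguation(hits: list[dict]) -> bool:
    """True when hits span more than one sub-database."""
    return len(group_by_data_type(hits)) > 1
```

When `needs_disambiguation` is false, taking the first hit is usually fine. When it is true, showing one labelled row per group ("raw", "branded", "as prepared") is usually enough to avoid the embarrassing cases.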

References

FDC dataset documentation: fdc.nal.usda.gov/data-documentation.html
NHANES / FNDDS background: ars.usda.gov/northeast-area/beltsville-md-bhnrc/beltsville-human-nutrition-research-center/food-surveys-research-group/docs/fndds-overview/
USDA FDC API getting started
USDA bulk → Postgres