Wikidata Ingestion Format
Purpose: Define how TriadicFrameworks ingests, queries, and structurally interprets Wikidata — the world's largest open knowledge graph (120M+ entities, CC0 license). Wikidata is Wikipedia's dimensional addressing layer — the machine‑readable substrate beneath every human‑readable article.
This file provides SPARQL templates, ingestion patterns, and RTT mappings so that students, researchers, and AIs can query Wikidata with structural awareness.
1 — What Is Wikidata?
Wikidata is a free, collaborative, multilingual knowledge graph hosted by the Wikimedia Foundation and edited by a volunteer community. It stores structured data as entity–property–value triplets that can be queried via SPARQL.
Key Facts
| Dimension | Value |
|---|---|
| Entities | 120M+ items (Q‑numbers) |
| Properties | 12,000+ relationship types (P‑numbers) |
| Statements | 2B+ structural claims |
| Languages | Labels in 400+ languages |
| License | CC0 — public domain (no attribution required) |
| Query endpoint | https://query.wikidata.org/ (web interface); https://query.wikidata.org/sparql (programmatic SPARQL endpoint) |
| Entity lookup | https://www.wikidata.org/wiki/Qnnn |
| Property lookup | https://www.wikidata.org/wiki/Property:Pnnn |
Why CC0 Matters
Wikidata's CC0 license means no copyright restrictions on reuse. Unlike Wikipedia article text (CC BY‑SA 4.0, which requires attribution and share‑alike), Wikidata statements can be freely ingested, transformed, and republished without any license obligations. This makes Wikidata the ideal ingestion substrate for automated RTT analysis.
2 — RTT Structural Mapping
2.1 — The Core Triplet
Every Wikidata statement follows the pattern:
Subject (Q‑number) → Property (P‑number) → Value (Q‑number, string, or quantity)
In RTT terms:
Dimensional Address → Dimensional Operator → Structural Claim
Example:
Q283 (Water) → P274 (chemical formula) → "H₂O"
RTT reading:
dimensional_address(Water)
→ dimensional_operator(chemical_formula)
→ structural_claim("H₂O")
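The mapping above can be sketched in Python. This is a minimal illustration, assuming a hypothetical `RTTTriplet` container; the class and field names are invented for this example and are not part of Wikidata or of any existing TriadicFrameworks API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RTTTriplet:
    """Hypothetical container for one Wikidata statement read in RTT terms."""

    dimensional_address: str   # subject Q-number, e.g. "Q283"
    dimensional_operator: str  # property P-number, e.g. "P274"
    structural_claim: str      # value: Q-number, string, or quantity

    def reading(self) -> str:
        """Render the triplet in the arrow notation used in this section."""
        return (
            f"{self.dimensional_address} → "
            f"{self.dimensional_operator} → "
            f"{self.structural_claim!r}"
        )


# The water example from this section.
water_formula = RTTTriplet("Q283", "P274", "H₂O")
```

Calling `water_formula.reading()` yields the same subject, property, value chain shown in the example, with the value quoted.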
2.2 — Full Mapping Table
| Wikidata Element | RTT Concept | Function |
|---|---|---|
| Item (Q‑number) | Dimensional address | Unique coordinate for a concept across all languages and datasets |
| Property (P‑number) | Dimensional operator | Typed relationship that connects dimensional addresses |
| Statement | Structural claim | One regime assertion: "this entity has this property with this value" |
| Qualifier | Claim context | Conditions under which the claim holds (time range, location, method, applies‑to) |
| Reference | Claim provenance | External source validating the structural claim |
| Rank (preferred/normal/deprecated) | Claim confidence | Structural standing of the claim — preferred = regime consensus; deprecated = superseded regime |
| Sitelink | Cross‑language regime bridge | Same dimensional address linking to regime declarations in different language Wikipedias |
| Label | Regime name | Human‑readable name for the dimensional address in a specific language |
| Description | Regime summary | One‑sentence regime declaration in a specific language |
| Aliases | Regime aliases | Alternative names that resolve to the same dimensional address |
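The rank row can be made concrete with a short sketch. It assumes statements shaped like the entries in Wikidata's entity JSON, where each statement carries a `rank` of `preferred`, `normal`, or `deprecated`. The helper name `best_statements` and the sample list are hypothetical; the sample is illustrative and not live data.

```python
def best_statements(statements):
    """Return preferred-rank statements if any exist, otherwise the
    normal-rank ones. Deprecated statements are always dropped."""
    preferred = [s for s in statements if s.get("rank") == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s.get("rank") == "normal"]


# Illustrative input only: simplified statements for one property.
sample_statements = [
    {"value": "Q64", "rank": "preferred"},
    {"value": "Q586", "rank": "normal"},
    {"value": "Q586", "rank": "deprecated"},
]

current_regime = best_statements(sample_statements)
```

In RTT terms, the result is the set of claims that currently hold regime consensus; superseded claims remain in the graph but are not returned.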
2.3 — Qualifiers as Regime Context
Qualifiers are what make Wikidata structurally rich — they add context to claims, turning flat assertions into regime‑aware statements:
| Qualifier Property | RTT Function | Example |
|---|---|---|
| P580 (start time) | Regime birth | "Germany (Q183) → capital (P36) → Berlin (Q64), start time: 1990" |
| P582 (end time) | Regime expiry | "Germany (Q183) → capital (P36) → Bonn (Q586), end time: 1990" |
| P585 (point in time) | Regime snapshot | "World population (Q11188) → population (P1082) → 8B, point in time: 2023" |
| P1013 (criterion used) | Regime method | "Mount Everest (Q513) → elevation (P2044) → 8848.86m, criterion: EGM2008 geoid" |
| P459 (determination method) | Measurement regime | How the value was obtained — structural provenance |
| P518 (applies to part) | Sub‑regime scope | The claim applies to a specific sub‑component, not the whole entity |
| P1480 (sourcing circumstances) | Claim reliability | "circa," "presumably," "possibly" — structural confidence markers |
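As a sketch of how the table above could be applied during ingestion, the following reads the `qualifiers` of one statement in Wikidata's entity JSON layout and groups the raw values by RTT function. The role names in `RTT_QUALIFIER_ROLES` and the function `regime_context` are assumptions made for this example, and the sample statement is illustrative rather than copied from the live item.

```python
# Hypothetical lookup from qualifier P-number to the RTT function in the table.
RTT_QUALIFIER_ROLES = {
    "P580": "regime_birth",
    "P582": "regime_expiry",
    "P585": "regime_snapshot",
    "P1013": "regime_method",
    "P459": "measurement_regime",
    "P518": "sub_regime_scope",
    "P1480": "claim_reliability",
}


def regime_context(statement):
    """Group a statement's qualifier values by RTT role."""
    context = {}
    for pid, snaks in statement.get("qualifiers", {}).items():
        role = RTT_QUALIFIER_ROLES.get(pid)
        if role is None:
            continue  # qualifier not covered by the table
        values = []
        for snak in snaks:
            if snak.get("snaktype") != "value":
                continue  # skip "somevalue" / "novalue" snaks
            raw = snak["datavalue"]["value"]
            if isinstance(raw, dict):
                # Times, entity ids, and quantities are nested objects.
                raw = raw.get("time") or raw.get("id") or raw.get("amount") or raw
            values.append(raw)
        context[role] = values
    return context


# Illustrative statement: Germany (Q183) -> capital (P36) -> Berlin (Q64).
sample_statement = {
    "mainsnak": {"snaktype": "value", "property": "P36",
                 "datavalue": {"type": "wikibase-entityid", "value": {"id": "Q64"}}},
    "rank": "preferred",
    "qualifiers": {
        "P580": [{"snaktype": "value", "property": "P580",
                  "datavalue": {"type": "time",
                                "value": {"time": "+1990-00-00T00:00:00Z", "precision": 9}}}],
    },
}
```

For the sample, `regime_context(sample_statement)` returns a single `regime_birth` entry holding the raw time string; converting it to a date is left to the caller.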
3 — SPARQL Query Templates
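All queries run in the web interface at https://query.wikidata.org/. Programmatic clients send the same queries to https://query.wikidata.org/sparql.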
3.1 — Basic: Get All Properties for an Entity
Use case: What does the knowledge graph say about a concept?
# All statements for a given entity
# Replace Q283 with any Q-number
SELECT ?property ?propertyLabel ?value ?valueLabel
WHERE {
  wd:Q283 ?claim ?statement .
  ?statement ?ps ?value .
  ?property wikibase:claim ?claim .
  ?property wikibase:statementProperty ?ps .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
ORDER BY ?propertyLabel
RTT reading: This returns the full regime declaration for a concept: every structural claim the knowledge graph makes about it.
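Any of the query templates in this section can be sent from a script. The sketch below uses only the Python standard library and the public SPARQL endpoint at https://query.wikidata.org/sparql. The function names `build_sparql_request` and `run_sparql` are hypothetical helpers, and the User-Agent string is the one suggested in Section 6.

```python
import json
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
USER_AGENT = (
    "TriadicFrameworks/1.0 "
    "(https://www.triadicframeworks.org/; contact@triadicframeworks.org)"
)


def build_sparql_request(query):
    """Build a GET request for the query, asking for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{SPARQL_ENDPOINT}?{params}",
        headers={
            "User-Agent": USER_AGENT,
            "Accept": "application/sparql-results+json",
        },
    )


def run_sparql(query, timeout=60):
    """Send the query and return the list of result bindings.

    Performs a network call; it is not invoked at import time.
    """
    request = build_sparql_request(query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

Each binding is a dict keyed by variable name (for example `property`, `value`), whose entries carry a `value` string and a `type`.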
3.2 — Cross‑Domain Bridging: Find All Domains a Concept Touches
Use case: How many knowledge domains does this concept connect to?
# Count cross-domain P-number bridges for an entity
# Replace Q283 with any Q-number
SELECT ?property ?propertyLabel
       (COUNT(DISTINCT ?value) AS ?connections)
WHERE {
  wd:Q283 ?claim ?statement .
  ?statement ?ps ?value .
  ?property wikibase:claim ?claim .
  ?property wikibase:statementProperty ?ps .
  FILTER(ISIRI(?value))
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
GROUP BY ?property ?propertyLabel
ORDER BY DESC(?connections)
RTT reading: Each P‑number that points to an entity in a different domain = a dimensional bridge. The count of cross‑domain bridges = the concept's structural connectivity score.
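One way to turn the rows of this query into a single number is sketched below. It assumes the standard SPARQL JSON result layout and defines the score as the sum of `?connections` across properties, which is an assumption of this example. The query as written counts every IRI-valued link, so restricting the count to links that cross a domain boundary would need an additional classification step that is not shown. The sample rows are illustrative.

```python
def connectivity_score(bindings):
    """Return (total, per_property) from the result rows of Query 3.2."""
    per_property = {}
    for row in bindings:
        # Property IRIs look like http://www.wikidata.org/entity/P31
        pid = row["property"]["value"].rsplit("/", 1)[-1]
        per_property[pid] = int(row["connections"]["value"])
    return sum(per_property.values()), per_property


# Illustrative rows only, in SPARQL JSON result layout.
sample_rows = [
    {"property": {"value": "http://www.wikidata.org/entity/P31"},
     "connections": {"value": "3"}},
    {"property": {"value": "http://www.wikidata.org/entity/P361"},
     "connections": {"value": "2"}},
]

total, breakdown = connectivity_score(sample_rows)
```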
3.3 — Temporal Regime Analysis: Track Property Changes Over Time
Use case: How has a concept's regime evolved?
# Temporal regime evolution for a property
# Example: Population of France over time
SELECT ?population ?pointInTime
WHERE {
  wd:Q142 p:P1082 ?statement .
  ?statement ps:P1082 ?population .
  ?statement pq:P585 ?pointInTime .
}
ORDER BY ?pointInTime
RTT reading: Each row = a temporal regime snapshot. The series reveals the regime's evolution curve: growth, stability, decline, or oscillation.
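To work with the snapshots numerically, the result rows can be reduced to `(year, value)` pairs. The helper `to_series` below is a hypothetical sketch that assumes the variable names of this query (`pointInTime`, `population`) and keeps only the year of each timestamp. The sample rows are illustrative, not real population figures.

```python
import re


def to_series(bindings):
    """Convert Query 3.3 result rows into a sorted list of (year, value)."""
    series = []
    for row in bindings:
        # Timestamps look like 2001-01-01T00:00:00Z; BCE years carry a minus sign.
        match = re.match(r"(-?\d+)-", row["pointInTime"]["value"])
        if match is None:
            continue  # skip rows whose timestamp cannot be parsed
        year = int(match.group(1))
        series.append((year, float(row["population"]["value"])))
    return sorted(series)


# Illustrative rows only.
sample_rows = [
    {"pointInTime": {"value": "2001-01-01T00:00:00Z"}, "population": {"value": "200"}},
    {"pointInTime": {"value": "2000-01-01T00:00:00Z"}, "population": {"value": "100"}},
]

series = to_series(sample_rows)
```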
3.4 — Regime Hierarchy: Traverse the Class Tree
Use case: Where does a concept sit in the regime hierarchy?
# Class hierarchy (upward traversal) for an entity
# Replace Q283 with any Q-number
SELECT ?class ?classLabel ?depth
WHERE {
  wd:Q283 wdt:P31/wdt:P279* ?class .
  {
    SELECT ?class (COUNT(?mid) AS ?depth)
    WHERE {
      wd:Q283 wdt:P31/wdt:P279* ?mid .
      ?mid wdt:P279* ?class .
    }
    GROUP BY ?class
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
ORDER BY ?depth
RTT reading: This traces the concept's position in the regime hierarchy, from specific instance up through increasingly general regime classifications. Depth = regime granularity. Breadth at each level = regime diversity.
3.5 — Cross‑Language Regime Comparison: Sitelink Coverage
Use case: How many languages declare this concept?
# List Wikipedia sitelinks (language versions) for an entity
# The number of rows returned is the sitelink count
SELECT ?sitelink
WHERE {
  ?sitelink schema:about wd:Q283 .
  ?sitelink schema:isPartOf/wikibase:wikiGroup "wikipedia" .
}
RTT reading: Each sitelink = a regime declaration in a different cultural context. High sitelink count = universally recognized concept. Low count = culturally specific or specialized regime. Comparing article lengths across sitelinks reveals cultural regime variance: same concept, different structural emphasis.
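The sitelink URLs returned by this query can be reduced to a list of language editions. The sketch below assumes each `?sitelink` value is an article URL on a `*.wikipedia.org` host and takes the first host label as the edition code. That code is the wiki's subdomain, which usually but not always matches a language code (for example `simple`). The helper name and the sample URLs are illustrative.

```python
from urllib.parse import urlparse


def sitelink_languages(sitelink_urls):
    """Return the sorted, de-duplicated Wikipedia edition codes."""
    languages = set()
    for url in sitelink_urls:
        host = urlparse(url).hostname or ""
        if host.endswith(".wikipedia.org"):
            languages.add(host.split(".")[0])
    return sorted(languages)


# Illustrative input only.
sample_urls = [
    "https://en.wikipedia.org/wiki/Water",
    "https://de.wikipedia.org/wiki/Wasser",
    "https://fr.wikipedia.org/wiki/Eau",
    "https://en.wikipedia.org/wiki/Water",
]

editions = sitelink_languages(sample_urls)
```

The sitelink count used elsewhere in this file is then `len(editions)`.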
3.6 — Validation Corridor: Find Featured Articles by Domain
Use case: What are the structurally validated reference points in a domain?
# Find items with Featured Article sitelinks in English Wikipedia
# Filter by domain using instance-of (P31) or subclass-of (P279)
# Example domain: chemical elements (Q11344); replace with any class
SELECT ?item ?itemLabel ?article
WHERE {
  ?article schema:about ?item .
  ?article schema:isPartOf <https://en.wikipedia.org/> .
  ?article wikibase:badge wd:Q17437796 .
  ?item wdt:P31/wdt:P279* wd:Q11344 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
LIMIT 50
RTT reading: Featured Articles = validation corridor gold standard. These are the concepts whose regime declarations have been community‑verified as structurally complete. They serve as reference templates for the domain.
3.7 — Regime Collision: Find Disambiguation Entities
Use case: Which concepts have competing regime claims on the same term?
# Find disambiguation items: concepts where multiple regimes
# claim the same term
SELECT ?item ?itemLabel ?article
WHERE {
  ?item wdt:P31 wd:Q4167410 .
  ?article schema:about ?item .
  ?article schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
LIMIT 100
RTT reading: Every disambiguation entity = a regime collision point: two or more structural regimes claiming the same term. The list of disambiguated meanings reveals the competing regime declarations.
4 — Ingestion Patterns
4.1 — Single Entity Deep Profile
Purpose: Build a complete RTT structural profile for one concept.
Steps:
- Resolve the Q‑number: search https://www.wikidata.org/w/api.php?action=wbsearchentities&search=TERM&language=en&format=json
- Pull all statements: use Query 3.1
- Count dimensional bridges: use Query 3.2
- Trace the regime hierarchy: use Query 3.4
- Check sitelink coverage: use Query 3.5
- Check for Featured Article badge: use Query 3.6
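The first step, resolving a term to a Q‑number, can be sketched as follows. It builds the `wbsearchentities` URL given in the step list and extracts candidates from the JSON response, whose `search` array holds one object per hit with `id`, `label`, and `description`. The helper names are hypothetical, the `limit` value is an arbitrary choice, and the sample response is illustrative.

```python
import urllib.parse


def build_search_url(term, language="en", limit=5):
    """Build the wbsearchentities URL for a search term."""
    params = urllib.parse.urlencode({
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    })
    return f"https://www.wikidata.org/w/api.php?{params}"


def parse_candidates(response):
    """Return (Q-number, label, description) for each search hit."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in response.get("search", [])
    ]


# Illustrative response fragment only.
sample_response = {
    "search": [{"id": "Q283", "label": "water", "description": "chemical compound"}]
}

candidates = parse_candidates(sample_response)
```

A term often matches several items, so the candidate list should be reviewed (or filtered by description) before the Q‑number is passed to the queries in Section 3.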
Output format:
{
  "entity": "Q283",
  "label": "Water",
  "regime_declaration": "chemical compound, binary compound, oxide",
  "dimensional_bridges": 47,
  "regime_hierarchy_depth": 8,
  "sitelink_count": 298,
  "featured_article": true,
  "top_properties": [
    "P274 (chemical formula): H₂O",
    "P31 (instance of): chemical compound",
    "P361 (part of): hydrosphere",
    "P2054 (density): 997 kg/m³",
    "P2101 (melting point): 0°C"
  ],
  "cross_domain_connections": [
    "Chemistry (compound properties)",
    "Physics (thermodynamic constants)",
    "Biology (biological role)",
    "Earth Sciences (hydrosphere)",
    "Engineering (industrial solvent)",
    "Medicine (essential nutrient)"
  ]
}
4.2 — Domain Sweep
Purpose: Map all Wikidata entities within a knowledge domain.
Steps:
- Identify the domain's root class (e.g., Physics → Q413 "physics")
- Query all instances and subclasses:
  SELECT ?item ?itemLabel
  WHERE {
    ?item wdt:P31/wdt:P279* wd:Q413 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
  }
  LIMIT 1000
- For each entity, count cross‑domain P‑number bridges
- Rank by structural connectivity
- The top‑ranked entities = the domain's most structurally connected concepts
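The counting and ranking steps above could be batched rather than run once per entity. The sketch below builds a single query for a list of Q‑numbers using a `VALUES` clause and then ranks the resulting counts. It is an assumption of this example, not the method of Query 3.2: it counts links over direct ("truthy") claims via `wikibase:directClaim`, and like Query 3.2 it counts every IRI-valued link rather than only cross‑domain ones. The helper names are hypothetical.

```python
import re


def bridge_count_query(qids):
    """Build one SPARQL query counting IRI-valued links per listed item."""
    for qid in qids:
        if not re.fullmatch(r"Q[1-9]\d*", qid):
            raise ValueError(f"not a Q-number: {qid!r}")
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?item (COUNT(DISTINCT ?value) AS ?connections) WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  ?item ?direct ?value .\n"
        "  ?property wikibase:directClaim ?direct .\n"
        "  FILTER(ISIRI(?value))\n"
        "}\n"
        "GROUP BY ?item"
    )


def rank_by_connectivity(counts, top=10):
    """Sort {Q-number: count} by descending count, ties broken by Q-number."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


query = bridge_count_query(["Q283", "Q11379"])
# Illustrative counts only, to show the ranking step.
ranking = rank_by_connectivity({"Q1": 3, "Q2": 7, "Q3": 3}, top=2)
```

Keeping each batch small helps the query stay within the service timeout described in Section 6.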
4.3 — Temporal Regime Tracking
Purpose: Monitor how a concept's regime evolves over time.
Steps:
- Select a property with temporal qualifiers (P580/P582/P585)
- Use Query 3.3 to extract the time series
- Plot the series: stable plateaus = crystallized regime; sharp transitions = regime shifts
- Cross‑reference with Wikipedia revision history for the same period — do article edits correlate with Wikidata property changes?
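The plateau and transition step can be approximated without plotting. The sketch below labels each consecutive pair of observations as a `plateau` or a `shift` based on relative change. The 5% threshold and the function name `classify_transitions` are assumptions chosen for illustration, and the sample series is invented.

```python
def classify_transitions(series, threshold=0.05):
    """Label each step of a time-sorted [(time, value), ...] series.

    A step is a "shift" when the relative change exceeds the threshold,
    otherwise a "plateau".
    """
    labels = []
    for (t0, v0), (t1, v1) in zip(series, series[1:]):
        if v0:
            change = abs(v1 - v0) / abs(v0)
        else:
            # Any move away from zero counts as a shift.
            change = float("inf") if v1 != v0 else 0.0
        labels.append((t0, t1, "shift" if change > threshold else "plateau"))
    return labels


# Illustrative series only.
sample_series = [(2000, 100), (2001, 101), (2002, 130)]
steps = classify_transitions(sample_series)
```

The threshold should be tuned per property: a 5% change is large for a national population but small for a company's employee count.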
4.4 — Cross‑Language Regime Divergence
Purpose: Compare how the same concept is structurally declared across cultures.
Steps:
- Use Query 3.5 to get all sitelinks
- For the top 5 largest language editions, compare:
- Article length (word count)
- Section structure (headings)
- Lead paragraph (regime declaration)
- Categories assigned
- Divergences = cultural regime variance — same dimensional address, different structural emphasis
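One of the comparisons above, section structure, can be given a rough numeric form. The sketch below computes a Jaccard distance between two sets of section headings. It assumes the headings have already been translated into a common language, and the metric itself is a choice made for this example rather than a defined RTT measure. The sample headings are invented.

```python
def structural_divergence(headings_a, headings_b):
    """Jaccard distance between two heading lists: 0.0 = identical, 1.0 = disjoint."""
    a = {h.strip().lower() for h in headings_a}
    b = {h.strip().lower() for h in headings_b}
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


# Illustrative headings only.
edition_one = ["History", "Properties", "Uses"]
edition_two = ["History", "Uses", "Culture"]

divergence = structural_divergence(edition_one, edition_two)
```

Here two of four distinct headings are shared, so the divergence is 0.5. Article length and lead-paragraph framing still need separate comparison.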
5 — Property Families for RTT Analysis
Not all 12,000+ P‑numbers are equally useful for RTT analysis. These property families provide the highest structural signal:
5.1 — Classification Properties (Regime Identity)
| Property | Name | RTT Function |
|---|---|---|
| P31 | instance of | Regime declaration — "this entity IS a [class]" |
| P279 | subclass of | Regime hierarchy — "this class is WITHIN [parent class]" |
| P361 | part of | Regime containment — "this entity is PART OF [whole]" |
| P527 | has part(s) | Regime composition — "this entity CONTAINS [parts]" |
| P460 | said to be the same as | Regime aliasing — cross‑ontology identity claim |
| P1889 | different from | Regime boundary — "do not confuse with [other entity]" |
5.2 — Relationship Properties (Dimensional Bridges)
| Property | Name | RTT Function |
|---|---|---|
| P737 | influenced by | Regime lineage — structural ancestry |
| P1542 | has effect | Regime causation — what this entity produces |
| P1269 | facet of | Regime perspective — this entity is one view of a broader concept |
| P2283 | uses | Regime dependency — structural requirements |
| P366 | has use | Regime application — what this entity enables |
5.3 — Temporal Properties (Regime Dynamics)
| Property | Name | RTT Function |
|---|---|---|
| P571 | inception | Regime birth — when the concept first existed |
| P576 | dissolved/abolished | Regime death — when the concept ceased to exist |
| P580 | start time | Regime activation — when a claim became true |
| P582 | end time | Regime expiry — when a claim stopped being true |
| P585 | point in time | Regime snapshot — claim valid at a specific moment |
| P1319 | earliest date | Regime lower bound — structural uncertainty floor |
| P1326 | latest date | Regime upper bound — structural uncertainty ceiling |
5.4 — Quantitative Properties (Regime Metrics)
| Property | Name | RTT Function |
|---|---|---|
| P1082 | population | Regime scale (demographic) |
| P2044 | elevation above sea level | Regime position (geographic) |
| P2054 | density | Regime density (physical) |
| P2067 | mass | Regime mass (physical) |
| P2101 | melting point | Regime phase boundary (thermodynamic) |
| P2102 | boiling point | Regime phase boundary (thermodynamic) |
| P2196 | students count | Regime scale (educational) |
| P2132 | nominal GDP per capita | Regime scale (economic) |
6 — Rate Limits and Ethical Use
Wikidata Query Service Limits
| Limit | Value |
|---|---|
| Query timeout | 60 seconds |
| Results per query | 500,000 rows max |
| Concurrent connections | 5 per IP |
| User‑Agent required | Yes — identify your tool/project |
Ethical Guidelines
- Respect rate limits: do not flood the endpoint with parallel queries
- Set a User‑Agent header that identifies you:
  User-Agent: TriadicFrameworks/1.0 (https://www.triadicframeworks.org/; contact@triadicframeworks.org)
- Cache results: don't re‑query the same data repeatedly
- Use database dumps for bulk analysis: the query endpoint is for interactive and moderate‑scale use
- Contribute back: if you discover missing data or errors, edit Wikidata directly (it's open for anyone to edit)
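The caching and rate-limit guidelines can be combined in a small wrapper. The sketch below is one possible shape, assuming a caller-supplied `fetch` function that sends a SPARQL string and returns its results. The class name `PoliteClient` and the one-second default spacing are hypothetical choices, not limits published by the query service.

```python
import hashlib
import time


class PoliteClient:
    """Cache results per query text and space out uncached requests."""

    def __init__(self, fetch, min_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch                # callable: sparql string -> results
        self._min_interval = min_interval  # seconds between uncached calls
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_call = None

    def query(self, sparql):
        key = hashlib.sha256(sparql.encode("utf-8")).hexdigest()
        if key in self._cache:
            return self._cache[key]        # no network traffic for repeats
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)          # throttle back-to-back requests
        self._last_call = self._clock()
        result = self._fetch(sparql)
        self._cache[key] = result
        return result
```

The cache lives in memory for the lifetime of the client; a persistent store would be needed to avoid re-querying across runs.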
7 — Relationship to Other Module Files
| File | Connection |
|---|---|
| Wikipedia_RTT_Structural_Mapping.md | Defines the RTT vocabulary this file uses (Q = dimensional address, P = dimensional operator) |
| Cross_Domain_Meta_Operators.md | Operator 4 (Wikidata Dimensional Bridging) depends directly on Query 3.2 from this file |
| Category_Taxonomy_Regime_Hierarchy.md | Wikipedia categories and Wikidata class hierarchy (P31/P279) are parallel regime classification systems; this file covers the Wikidata side |
| Revision_History_Regime_Analysis.md | Wikidata items have their own revision history; combine with Wikipedia article revision history for complete temporal coverage |
| Edit_War_Regime_Transition_Detection.md | Wikidata edit wars (P31 disputes, label conflicts) are structurally equivalent to Wikipedia edit wars |
| All 15 domain directories | Every domain's regime_alignment.md references Wikidata entities for its core concepts |
| ../resonance_atlas/nist_ingestion_format.md | Sibling: the NIST ingestion format covers a single institutional source; this file covers a crowdsourced knowledge graph |
8 — Student Exercises
Exercise 1 — Entity Profile (15 minutes)
- Pick any concept you know well
- Find its Wikidata Q‑number at https://www.wikidata.org/
- List its top 10 properties (P‑numbers)
- Classify each property into one of the 4 property families from Section 5
- Write a 1‑sentence regime declaration based on the P31 (instance of) value
Exercise 2 — Dimensional Bridging (30 minutes)
- Pick a concept with high cross‑domain connectivity (try: Energy Q11379, Information Q11028, Evolution Q1063, or Network Q1900326)
- Run Query 3.2 to count its dimensional bridges
- List the top 5 P‑numbers by connection count
- For each, identify which knowledge domain the bridge connects to
- Draw a simple diagram: your concept in the center, domains around the edges, P‑numbers as labeled connections
Exercise 3 — Cross‑Language Regime Variance (30 minutes)
- Pick a concept you expect to have cultural variance (try: Democracy Q7174, Marriage Q8445, or Freedom Q124490)
- Run Query 3.5 to count its sitelinks
- Open the Wikipedia article in English + 2 other languages (use Google Translate if needed)
- Compare: article length, section headings, lead paragraph framing
- Write a 2‑sentence summary: "The English Wikipedia declares [concept] as [X]. The [other language] Wikipedia declares it as [Y]. The structural difference reveals [Z]."
Exercise 4 — Temporal Regime Tracking (45 minutes)
- Pick a concept with temporal data (try: any country's population, a city's mayor, or a company's CEO)
- Run Query 3.3 to extract the time series
- Identify: stable plateaus, sharp transitions, gradual trends
- Cross‑reference with the Wikipedia article's revision history for the same period
- Answer: "Do Wikidata property changes and Wikipedia article edits correlate? If so, which leads — the data change or the narrative change?"
This file is part of the Wikipedia Awareness Module in the TriadicFrameworks canon.