Architecture of BEJSON

Document Name: Intro to BEJSON

The Architecture of
BEJSON

A strict, self-describing tabular data format engineered for positional integrity, rapid parsing, strong typing, and absolute human-machine interoperability.

01. The Origin & Philosophy

For decades, data storage has been a war between two opposing forces: Human Readability versus Machine Efficiency.

Traditional relational databases (SQL, SQLite) win the efficiency war. They enforce strict schemas and map relationships perfectly. But they fail humans. They are opaque, binary black-boxes that require specialized engines, servers, and drivers just to view the data inside them.

Standard JSON wins the readability war. It is pure text. But at enterprise scale, it becomes a structural nightmare. If you store 10 million financial transactions in JSON, you are forcing the computer to read the text strings "transaction_id" , "amount" , and "timestamp" 10 million times over. It destroys parsing speed and balloons file sizes.

The BEJSON Solution: Positional Integrity

BEJSON is the armistice. It is a highly-structured tabular format built natively on JSON. By declaring a single, unyielding Fields schema at the top of the document, the data payload drops all repetitive keys. The data becomes a strict array where the index of the value maps flawlessly to the index of the declared field. It parses at blistering speeds, cuts file bloat by fractions, and remains entirely readable in a basic text editor.

02. The Six Mandates

BEJSON does not tolerate ambiguity. Every single document, regardless of its variant, must adhere to six mandatory top-level keys. If a parser encounters a document missing even one of these keys, the document is mathematically invalid.

1. Format

Must be exactly "BEJSON" .

2. Format_Version

Dictates strict behavioral rules: "104" , "104a" , or "104db" .

3. Format_Creator

Must be exactly "Elton Boehnen" .

4. Records_Type

An array of strings declaring the precise entity name(s) housed within.

5. Fields

The schema declaration. An array of objects defining names and strict types ( string, integer, number, boolean, array, object ).

6. Values

The raw data payload. A strict array of arrays. Its inner length must perfectly match the length of the Fields array.

The Immutable Null Rule

Because BEJSON operates on exact positional arrays, you cannot simply "omit" a missing field as you would in standard JSON. To maintain structural length, absent data must be explicitly padded with null . null is legally accepted across all declared data types.

Format 104 The Heavy Lifter

When you are logging millions of identical records, you do not need metadata. You need raw, unadulterated throughput. BEJSON 104 is built for maximum density.

  • 1 Records_Type must contain exactly one entity string.
  • 2 Custom top-level headers are absolutely forbidden.
  • 3 Complex data types (arrays, objects) are fully permitted within records.

Schema Study: High-Frequency Trading Engine

Financial engines produce millions of ticks per hour. Standard JSON would collapse under the weight of repeated keys. Format 104 strips it down to the absolute mathematical minimum while maintaining a schema strict enough to prevent silent corruption. Notice the complex array ["HFT-BOT-9", "ROUTER-A"] nested seamlessly within the tabular data.

{
  "Format": "BEJSON",
  "Format_Version": "104",
  "Format_Creator": "Elton Boehnen",
  
  "Records_Type": ["MarketTick"],
  
  "Fields": [
    {"name": "timestamp_ns", "type": "integer"},
    {"name": "ticker_symbol", "type": "string"},
    {"name": "price_usd", "type": "number"},
    {"name": "volume_executed", "type": "integer"},
    {"name": "is_dark_pool", "type": "boolean"},
    {"name": "routing_path", "type": "array"}
  ],
  
  "Values": [
    [1715588100125000, "NVDA", 895.50, 100, false, ["HFT-BOT-9", "ROUTER-A"]],
    [1715588100125050, "AAPL", 183.25, 5000, true, null],
    [1715588100125088, "TSLA", 171.10, 50, false, ["RETAIL-GW"]]
  ]
}

Format 104a The Context Layer

Pure data tables are blind to the outside world. If you have 50 files of sensor data, how do you know what physical machine generated them? Format 104a trades inner complexity for outer context.

  • 1 Custom root-level metadata headers are allowed , but must use PascalCase .
  • 2 Record fields are restricted strictly to primitive types (no nested arrays or objects).
  • 3 Perfect for manifests, configuration files, and system health checks.

Schema Study: Submersible Deployment Manifest

Notice how the top of the file contains vital operational context (Vessel, Mission, Dive Limit). The actual tabular data below it maps out the primitive hardware status parameters. It serves as both a configuration document and a structured data table simultaneously.

{
  "Format": "BEJSON",
  "Format_Version": "104a",
  "Format_Creator": "Elton Boehnen",
  
  /* --- Contextual Metadata (PascalCase) --- */
  "Vessel_Class": "Nereus Deep-Sea Explorer",
  "Mission_Designation": "Mariana-Trench-Alpha",
  "Max_Dive_Limit_Meters": 11000,
  "Autopilot_Engaged": true,

  "Records_Type": ["HardwareStatus"],
  
  "Fields": [
    {"name": "component_id", "type": "string"},
    {"name": "power_draw_kw", "type": "number"},
    {"name": "requires_calibration", "type": "boolean"}
  ],
  
  "Values": [
    ["SONAR-ARRAY-MAIN", 4.5, false],
    ["BALLAST-PUMP-A", 12.0, true],
    ["EXTERNAL-LIGHTING", 1.2, false]
  ]
}

Format 104db Relational Text

The crowning achievement of BEJSON. Real systems have relationships. Patients have Blood Samples. Authors have Books. 104db allows multiple distinct entity schemas to coexist in a single text file, creating a highly portable, AI-native relational database.

The Mechanics of Null-Padding

  1. Records_Type declares multiple entities (e.g., ["Patient", "Sample"] ).
  2. The first column must act as a discriminator: {"name": "Record_Type_Parent", "type": "string"} .
  3. Every subsequent field is strictly owned by one specific entity via the "Record_Type_Parent" property.
  4. The Padding Law: When writing a "Patient" record, you must fill all the "Sample" columns with null . When writing a "Sample" record, all "Patient" columns are null .

Schema Study: Biotech Laboratory Tracker

Observe the visual architecture of the data. The LLM or human reader does not need to cross-reference multiple tables. The foreign key ( patient_id_fk ) immediately maps the Plasma Sample back to subject "Alexander Vance". The null-padding creates a visually distinct grid of relational logic.

{
  "Format": "BEJSON",
  "Format_Version": "104db",
  "Format_Creator": "Elton Boehnen",
  
  "Records_Type": ["Patient", "BioSample"],
  
  "Fields": [
    /* Discriminator (Mandatory position 0) */
    {"name": "Record_Type_Parent", "type": "string"},
    
    /* Entity A: Patient Fields */
    {"name": "patient_id", "type": "string", "Record_Type_Parent": "Patient"},
    {"name": "full_name",  "type": "string", "Record_Type_Parent": "Patient"},
    
    /* Entity B: BioSample Fields */
    {"name": "sample_id",    "type": "string", "Record_Type_Parent": "BioSample"},
    {"name": "sample_type",  "type": "string", "Record_Type_Parent": "BioSample"},
    {"name": "patient_id_fk","type": "string", "Record_Type_Parent": "BioSample"}
  ],
  
  "Values": [
    /* Patients (BioSample columns [3, 4, 5] are null-padded) */
    ["Patient", "PT-001", "Alexander Vance", null, null, null],
    ["Patient", "PT-002", "Dr. Evelyn Shaw", null, null, null],
    
    /* BioSamples (Patient columns [1, 2] are null-padded) */
    ["BioSample", null, null, "SMP-99A", "Blood Plasma",   "PT-001"],
    ["BioSample", null, null, "SMP-99B", "Bone Marrow",    "PT-001"],
    ["BioSample", null, null, "SMP-104", "Saliva Swab",    "PT-002"]
  ]
}

06. Immutable Rules

The Law of Evolution

You may append new fields to the very end of the Fields array. This is mathematically safe. You may never remove, reorder, or inject fields into the middle of the array. Doing so instantly destroys the positional integrity mapping of all legacy records.

Nomenclature Standards

  • » Entity fields must exclusively use snake_case .
  • » Root metadata headers in 104a must exclusively use PascalCase .
  • » Relational foreign keys in 104db must terminate with the _fk suffix (e.g., target_id_fk ).

The Limit of 104db

Format 104db is a scalpel, not a broadsword. Due to null-padding, file size grows exponentially as you add distinct entities. It is designed for lightweight applications, embedded AI context windows, and single-file relational portability—not for holding 100,000+ records across 50 distinct entity types. Know your limits.

BEJSON Architectural Spec • Positional Integrity • Data Portability


Related Content