JSON to JSON Schema

What is JSON Schema?

JSON Schema is a declarative language for annotating and validating JSON documents. A schema acts as a contract: it states which data an application expects and which values are allowed. You can think of it as a blueprint that describes the structure, data types, and constraints of JSON data.

Understanding the Conversion

Converting JSON to JSON Schema means transforming concrete data examples into abstract data specifications. You're essentially creating a template that describes what valid JSON data should look like.
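
As a rough illustration of that transformation, the following Python sketch maps each JSON value to a minimal schema fragment. The function name infer_schema is hypothetical, and two simplifications are assumptions rather than rules: arrays are described by their first item only, and every property seen in the sample is marked as required.

```python
import json


def infer_schema(value):
    """Return a minimal JSON Schema fragment describing one parsed JSON value."""
    if value is None:
        return {"type": "null"}
    if isinstance(value, bool):  # test bool before int: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:
            # Assumption: the array is homogeneous, so the first item is representative
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
            # Assumption: every property present in the sample is required
            "required": sorted(value),
        }
    raise TypeError(f"not a JSON value: {value!r}")


sample = json.loads('{"name": "John", "age": 30}')
print(json.dumps(infer_schema(sample), indent=2))
```

A single sample can only reveal types, not constraints such as ranges or patterns, so the output is a starting point to refine by hand.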

JSON vs JSON Schema

Aspect   | JSON                         | JSON Schema
Purpose  | Store/transmit data          | Describe/validate data structure
Content  | Actual data values           | Data type definitions and constraints
Usage    | Data interchange             | Data validation and documentation
Example  | {"name": "John", "age": 30}  | {"type": "object", "properties": {...}}

Why Convert JSON to JSON Schema?

1. Data Validation

Ensure incoming data matches expected structure:

// JSON data
{
  "email": "user@example.com",
  "age": 25
}

// Generated JSON Schema for validation
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "email": {
      "type": "string",
      "format": "email"
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150
    }
  },
  "required": ["email", "age"]
}
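
To make concrete what validating against such a schema involves, here is a deliberately small Python checker. It is a sketch, not a conforming validator: it only understands type, properties, required, minimum, and maximum, and it ignores format. The names matches_type, check, and user_schema are hypothetical.

```python
def matches_type(value, expected):
    """Check a parsed JSON value against a JSON Schema type name."""
    if expected == "object":
        return isinstance(value, dict)
    if expected == "array":
        return isinstance(value, list)
    if expected == "string":
        return isinstance(value, str)
    if expected == "boolean":
        return isinstance(value, bool)
    if expected == "null":
        return value is None
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if expected == "number":
        return is_number
    if expected == "integer":
        # JSON Schema also treats 1.0 as an integer
        return is_number and float(value).is_integer()
    return True  # unknown type name: do not reject


def check(instance, schema, path="$"):
    """Return a list of error messages (empty when the instance passes)."""
    expected = schema.get("type")
    if isinstance(expected, str) and not matches_type(instance, expected):
        return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"{path}.{name}: missing required property")
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(check(instance[name], subschema, f"{path}.{name}"))
    if matches_type(instance, "number"):
        if "minimum" in schema and instance < schema["minimum"]:
            errors.append(f"{path}: {instance} is below minimum {schema['minimum']}")
        if "maximum" in schema and instance > schema["maximum"]:
            errors.append(f"{path}: {instance} is above maximum {schema['maximum']}")
    return errors


user_schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "format": "email"},  # format is not checked here
        "age": {"type": "integer", "minimum": 0, "maximum": 150},
    },
    "required": ["email", "age"],
}

print(check({"age": 200}, user_schema))
```

In practice a full validator library should do this work; the sketch only shows which questions the schema answers about each value.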

2. API Documentation

Document expected request/response formats:

// API response example
{
  "status": "success",
  "data": {
    "users": [
      {"id": 1, "name": "Alice"}
    ]
  }
}

// Schema for documentation
{
  "type": "object",
  "properties": {
    "status": {"type": "string", "enum": ["success", "error"]},
    "data": {
      "type": "object",
      "properties": {
        "users": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "id": {"type": "integer"},
              "name": {"type": "string"}
            }
          }
        }
      }
    }
  }
}

3. Code Generation

Generate type definitions for programming languages:

// JSON Schema
{
  "type": "object",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
      }
    }
  }
}

// Generated TypeScript interface
// (properties are optional because the schema lists none under "required")
interface User {
  name?: string;
  age?: number;
}

interface RootObject {
  user?: User;
}
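
The mapping from schema to type definitions can be sketched in a few lines of Python. This is an illustrative generator, not how any particular tool works: the names TS_TYPES and to_interfaces are hypothetical, nested objects are named after their property, arrays and unions fall back to unknown, and properties missing from required are emitted as optional.

```python
TS_TYPES = {
    "string": "string",
    "integer": "number",
    "number": "number",
    "boolean": "boolean",
    "null": "null",
}


def to_interfaces(name, schema, out=None):
    """Return TypeScript interface declarations for an object schema, nested ones first."""
    out = [] if out is None else out
    required = set(schema.get("required", []))
    lines = [f"interface {name} {{"]
    for prop, subschema in schema.get("properties", {}).items():
        if subschema.get("type") == "object":
            # Assumption: name the nested interface after the capitalized property
            ts_type = prop[0].upper() + prop[1:]
            to_interfaces(ts_type, subschema, out)
        else:
            ts_type = TS_TYPES.get(subschema.get("type"), "unknown")
        optional = "" if prop in required else "?"
        lines.append(f"  {prop}{optional}: {ts_type};")
    lines.append("}")
    out.append("\n".join(lines))
    return out


root_schema = {
    "type": "object",
    "properties": {
        "user": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
        }
    },
}

print("\n\n".join(to_interfaces("RootObject", root_schema)))
```

Adding a required array to the schema is what turns the optional members into mandatory ones in the generated types.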

JSON Schema Structure

Basic Schema Components

  1. $schema - Identifies the JSON Schema dialect (version) the schema is written against
  2. type - Data type (object, array, string, number, integer, boolean, null)
  3. properties - Object property definitions
  4. items - Array item definitions
  5. required - Names of the properties that must be present
  6. Constraint keywords - Validation rules such as minimum, maxLength, pattern, and enum

Schema Template

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "title": "Schema Title",
  "description": "Schema Description",
  "properties": {
    // Property definitions
  },
  "required": [],
  "additionalProperties": false
}

Data Type Conversions

1. String Type

JSON Example:

{
  "name": "John Doe",
  "email": "john@example.com",
  "phone": "+15550123"
}

Generated Schema:

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "minLength": 1
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "phone": {
      "type": "string",
      "pattern": "^\\+[1-9]\\d{1,14}$"
    }
  }
}

2. Number Types

JSON Example:

{
  "age": 30,
  "salary": 50000.50,
  "score": 95.5
}

Generated Schema:

{
  "type": "object",
  "properties": {
    "age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150
    },
    "salary": {
      "type": "number",
      "minimum": 0
    },
    "score": {
      "type": "number",
      "minimum": 0,
      "maximum": 100
    }
  }
}

3. Boolean Type

JSON Example:

{
  "isActive": true,
  "verified": false
}

Generated Schema:

{
  "type": "object",
  "properties": {
    "isActive": {
      "type": "boolean"
    },
    "verified": {
      "type": "boolean"
    }
  }
}

4. Array Type

JSON Example:

{
  "tags": ["tech", "programming"],
  "scores": [85, 92, 78],
  "users": [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"}
  ]
}

Generated Schema:

{
  "type": "object",
  "properties": {
    "tags": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "minItems": 1,
      "uniqueItems": true
    },
    "scores": {
      "type": "array",
      "items": {
        "type": "integer",
        "minimum": 0,
        "maximum": 100
      }
    },
    "users": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": {"type": "integer"},
          "name": {"type": "string"}
        },
        "required": ["id", "name"]
      }
    }
  }
}

5. Object Type

JSON Example:

{
  "address": {
    "street": "123 Main St",
    "city": "New York",
    "zipCode": "10001"
  }
}

Generated Schema:

{
  "type": "object",
  "properties": {
    "address": {
      "type": "object",
      "properties": {
        "street": {"type": "string"},
        "city": {"type": "string"},
        "zipCode": {"type": "string", "pattern": "^\\d{5}$"}
      },
      "required": ["street", "city", "zipCode"],
      "additionalProperties": false
    }
  }
}

6. Null and Mixed Types

JSON Example:

{
  "optionalField": null,
  "mixedValue": "could be string or number"
}

Generated Schema:

{
  "type": "object",
  "properties": {
    "optionalField": {
      "type": ["string", "null"]
    },
    "mixedValue": {
      "oneOf": [
        {"type": "string"},
        {"type": "number"}
      ]
    }
  }
}
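
A single null sample cannot reveal what the non-null type would be, so nullable and mixed fields are usually inferred from several observations of the same field. The Python sketch below collects the JSON types seen and emits a type array, which for simple types is equivalent to the oneOf form above. The names json_type and merge_types are hypothetical.

```python
def json_type(value):
    """Return the JSON Schema type name for a parsed JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def merge_types(observed_values):
    """Combine the types of all observed values of one field into a schema fragment."""
    types = {json_type(value) for value in observed_values}
    if "number" in types:
        types.discard("integer")  # every integer is also a valid number
    if not types:
        return {}  # nothing observed: the empty schema accepts any value
    ordered = sorted(types)
    return {"type": ordered[0]} if len(ordered) == 1 else {"type": ordered}


print(merge_types([None, "abc", "def"]))
print(merge_types([1, 2.5]))
```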

Complete Conversion Examples

Example 1: User Profile

JSON Input:

{
  "user": {
    "id": 12345,
    "username": "johndoe",
    "email": "john@example.com",
    "profile": {
      "firstName": "John",
      "lastName": "Doe",
      "age": 28,
      "bio": "Software developer"
    },
    "preferences": {
      "theme": "dark",
      "notifications": {
        "email": true,
        "push": false
      }
    },
    "roles": ["user", "editor"],
    "lastLogin": "2024-01-15T10:30:00Z",
    "isActive": true
  }
}

Generated JSON Schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "title": "User Profile Schema",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "id": {
          "type": "integer",
          "minimum": 1
        },
        "username": {
          "type": "string",
          "minLength": 3,
          "maxLength": 20,
          "pattern": "^[a-zA-Z0-9_]+$"
        },
        "email": {
          "type": "string",
          "format": "email"
        },
        "profile": {
          "type": "object",
          "properties": {
            "firstName": {
              "type": "string",
              "minLength": 1,
              "maxLength": 50
            },
            "lastName": {
              "type": "string",
              "minLength": 1,
              "maxLength": 50
            },
            "age": {
              "type": "integer",
              "minimum": 13,
              "maximum": 120
            },
            "bio": {
              "type": "string",
              "maxLength": 500
            }
          },
          "required": ["firstName", "lastName", "age"],
          "additionalProperties": false
        },
        "preferences": {
          "type": "object",
          "properties": {
            "theme": {
              "type": "string",
              "enum": ["light", "dark", "auto"]
            },
            "notifications": {
              "type": "object",
              "properties": {
                "email": {"type": "boolean"},
                "push": {"type": "boolean"}
              },
              "required": ["email", "push"],
              "additionalProperties": false
            }
          },
          "required": ["theme", "notifications"],
          "additionalProperties": false
        },
        "roles": {
          "type": "array",
          "items": {
            "type": "string",
            "enum": ["user", "admin", "editor", "viewer"]
          },
          "minItems": 1,
          "uniqueItems": true
        },
        "lastLogin": {
          "type": "string",
          "format": "date-time"
        },
        "isActive": {
          "type": "boolean"
        }
      },
      "required": ["id", "username", "email", "profile", "isActive"],
      "additionalProperties": false
    }
  },
  "required": ["user"],
  "additionalProperties": false
}

Example 2: E-commerce Product

JSON Input:

{
  "product": {
    "id": "prod_123",
    "name": "Gaming Laptop",
    "description": "High-performance gaming laptop",
    "price": 1299.99,
    "currency": "USD",
    "inStock": true,
    "categories": ["electronics", "computers", "gaming"],
    "specifications": {
      "brand": "TechCorp",
      "model": "GX-2024",
      "weight": 2.5,
      "dimensions": {
        "width": 35.6,
        "height": 2.5,
        "depth": 24.2
      },
      "ports": ["USB-C", "HDMI", "USB-A"]
    },
    "images": [
      "https://example.com/image1.jpg",
      "https://example.com/image2.jpg"
    ],
    "createdAt": "2024-01-01T00:00:00Z",
    "updatedAt": "2024-01-15T10:30:00Z"
  }
}

Generated JSON Schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "title": "Product Schema",
  "properties": {
    "product": {
      "type": "object",
      "properties": {
        "id": {
          "type": "string",
          "pattern": "^prod_[a-zA-Z0-9]+$"
        },
        "name": {
          "type": "string",
          "minLength": 1,
          "maxLength": 100
        },
        "description": {
          "type": "string",
          "maxLength": 1000
        },
        "price": {
          "type": "number",
          "minimum": 0,
          "multipleOf": 0.01
        },
        "currency": {
          "type": "string",
          "enum": ["USD", "EUR", "GBP", "JPY"]
        },
        "inStock": {
          "type": "boolean"
        },
        "categories": {
          "type": "array",
          "items": {
            "type": "string",
            "minLength": 1
          },
          "minItems": 1,
          "uniqueItems": true
        },
        "specifications": {
          "type": "object",
          "properties": {
            "brand": {"type": "string"},
            "model": {"type": "string"},
            "weight": {
              "type": "number",
              "minimum": 0
            },
            "dimensions": {
              "type": "object",
              "properties": {
                "width": {"type": "number", "minimum": 0},
                "height": {"type": "number", "minimum": 0},
                "depth": {"type": "number", "minimum": 0}
              },
              "required": ["width", "height", "depth"],
              "additionalProperties": false
            },
            "ports": {
              "type": "array",
              "items": {
                "type": "string",
                "enum": ["USB-C", "USB-A", "HDMI", "Ethernet", "Audio"]
              },
              "uniqueItems": true
            }
          },
          "required": ["brand", "model"],
          "additionalProperties": false
        },
        "images": {
          "type": "array",
          "items": {
            "type": "string",
            "format": "uri"
          },
          "minItems": 1
        },
        "createdAt": {
          "type": "string",
          "format": "date-time"
        },
        "updatedAt": {
          "type": "string",
          "format": "date-time"
        }
      },
      "required": ["id", "name", "price", "currency", "inStock"],
      "additionalProperties": false
    }
  },
  "required": ["product"],
  "additionalProperties": false
}

Advanced Schema Features

1. Conditional Validation

{
  "type": "object",
  "properties": {
    "userType": {"type": "string", "enum": ["admin", "user"]},
    "permissions": {"type": "array"}
  },
  "if": {
    "properties": {"userType": {"const": "admin"}},
    "required": ["userType"]
  },
  "then": {
    "properties": {
      "permissions": {
        "contains": {"const": "admin"}
      }
    }
  },
  "else": {
    "properties": {
      "permissions": {
        "not": {"contains": {"const": "admin"}}
      }
    }
  }
}

2. Pattern Properties

{
  "type": "object",
  "patternProperties": {
    "^[a-zA-Z0-9]+_config$": {
      "type": "object",
      "properties": {
        "enabled": {"type": "boolean"},
        "value": {"type": "string"}
      }
    }
  },
  "additionalProperties": false
}
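
To see which keys such a schema would reject, the Python sketch below lists the keys of an object that are neither declared under properties nor matched by a patternProperties expression. The helper name undeclared_keys and the sample data are hypothetical. Patterns are applied with re.search because JSON Schema patterns are not implicitly anchored; Python's regex dialect differs from the ECMAScript one JSON Schema recommends, although not for this pattern.

```python
import re


def undeclared_keys(instance, schema):
    """Return keys that "additionalProperties": false would reject."""
    declared = set(schema.get("properties", {}))
    patterns = [re.compile(p) for p in schema.get("patternProperties", {})]
    return [
        key
        for key in instance
        if key not in declared and not any(p.search(key) for p in patterns)
    ]


config_schema = {
    "type": "object",
    "patternProperties": {
        r"^[a-zA-Z0-9]+_config$": {"type": "object"},
    },
    "additionalProperties": False,
}

settings = {"cache_config": {}, "db_config": {}, "debug": True}
print(undeclared_keys(settings, config_schema))
```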

3. Definitions and References

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "users": {
      "type": "array",
      "items": {"$ref": "#/$defs/user"}
    }
  },
  "$defs": {
    "user": {
      "type": "object",
      "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "email": {"type": "string", "format": "email"}
      },
      "required": ["id", "name", "email"]
    }
  }
}
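
A reference such as "#/$defs/user" is a JSON Pointer into the same document. The Python sketch below shows how one could resolve it by walking the schema. The function name resolve_ref is hypothetical, only same-document references are handled, and percent-decoding of the fragment is left out for brevity.

```python
def resolve_ref(root, ref):
    """Resolve a same-document $ref (a JSON Pointer in a URI fragment) against root."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are supported in this sketch")
    node = root
    # "#/$defs/user" -> ["", "$defs", "user"]; the leading empty token is skipped
    for raw in ref[1:].split("/")[1:]:
        token = raw.replace("~1", "/").replace("~0", "~")  # JSON Pointer escapes
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


document = {
    "type": "object",
    "properties": {
        "users": {"type": "array", "items": {"$ref": "#/$defs/user"}},
    },
    "$defs": {
        "user": {
            "type": "object",
            "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
            "required": ["id", "name"],
        }
    },
}

items = document["properties"]["users"]["items"]
print(resolve_ref(document, items["$ref"])["required"])
```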

Best Practices

1. Infer Constraints from Data

// From multiple JSON examples
[
  {"age": 25, "score": 85},
  {"age": 30, "score": 92},
  {"age": 22, "score": 78}
]

// Infer realistic constraints
{
  "type": "object",
  "properties": {
    "age": {
      "type": "integer",
      "minimum": 18,    // Widened from the observed minimum (22)
      "maximum": 65     // Widened from the observed maximum (30)
    },
    "score": {
      "type": "integer",
      "minimum": 0,     // Logical minimum
      "maximum": 100    // Logical maximum
    }
  }
}
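
The observed range is a useful starting point before applying domain knowledge. This Python sketch computes it per field; the name observed_range is hypothetical. The result only says what the samples contained, so bounds are normally widened by hand afterwards.

```python
def observed_range(samples, field):
    """Return the smallest and largest numeric value seen for a field."""
    values = [
        sample[field]
        for sample in samples
        if isinstance(sample.get(field), (int, float))
        and not isinstance(sample.get(field), bool)  # bool is not a JSON number
    ]
    if not values:
        return {}
    return {"minimum": min(values), "maximum": max(values)}


samples = [
    {"age": 25, "score": 85},
    {"age": 30, "score": 92},
    {"age": 22, "score": 78},
]

print(observed_range(samples, "age"))
print(observed_range(samples, "score"))
```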

2. Use Appropriate Formats

{
  "properties": {
    "email": {"type": "string", "format": "email"},
    "website": {"type": "string", "format": "uri"},
    "birthday": {"type": "string", "format": "date"},
    "timestamp": {"type": "string", "format": "date-time"},
    "ipAddress": {"type": "string", "format": "ipv4"}
  }
}
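
When deriving a schema from samples, a format can be guessed from the shape of a string. The Python sketch below is one heuristic way to do that. The name guess_format and the regular expressions are assumptions, far looser than the formal definitions of each format, and a guess should be confirmed against more than one sample.

```python
import ipaddress
import re
from datetime import date
from urllib.parse import urlparse

# Loose shape checks, not full RFC 3339 / RFC 5322 grammars
DATE_TIME = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})")
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def guess_format(text):
    """Return a likely JSON Schema format name for a string, or None."""
    try:
        ipaddress.IPv4Address(text)
        return "ipv4"
    except ValueError:
        pass
    if DATE_TIME.fullmatch(text):
        return "date-time"
    if DATE.fullmatch(text):
        try:
            date.fromisoformat(text)  # rejects impossible dates such as 2024-02-30
            return "date"
        except ValueError:
            return None
    if EMAIL.fullmatch(text):
        return "email"
    parts = urlparse(text)
    if parts.scheme and parts.netloc:
        return "uri"
    return None


for value in ["2024-01-15T10:30:00Z", "2024-01-15", "192.168.0.1", "hello"]:
    print(value, "->", guess_format(value))
```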

3. Document Your Schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "User Registration Schema",
  "description": "Schema for validating user registration data",
  "type": "object",
  "properties": {
    "username": {
      "type": "string",
      "description": "Unique username for the account",
      "minLength": 3,
      "maxLength": 20,
      "examples": ["johndoe", "alice123"]
    }
  }
}

4. Handle Optional vs Required Fields

{
  "type": "object",
  "properties": {
    "name": {"type": "string"},      // Required
    "email": {"type": "string"},     // Required
    "phone": {"type": "string"},     // Optional
    "bio": {"type": "string"}        // Optional
  },
  "required": ["name", "email"],      // Only essential fields
  "additionalProperties": false
}
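
With several samples available, a reasonable first guess for required is the set of properties that appear in every one of them. A minimal Python sketch, with the hypothetical name infer_required and made-up sample records:

```python
def infer_required(samples):
    """Return the property names present in every sample object, sorted."""
    if not samples:
        return []
    common = set(samples[0])
    for sample in samples[1:]:
        common.intersection_update(sample)  # keep only keys seen in this sample too
    return sorted(common)


records = [
    {"name": "Alice", "email": "alice@example.com", "phone": "+15550123"},
    {"name": "Bob", "email": "bob@example.com"},
    {"name": "Carol", "email": "carol@example.com", "bio": "Editor"},
]

print(infer_required(records))
```

A property that happens to appear in every sample may still be optional in practice, so the result is a candidate list to review rather than a final answer.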

Validation Examples

Schema Validation in JavaScript

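// The default Ajv export targets draft-07; for schemas that declare
// draft 2020-12, use require('ajv/dist/2020') instead (Ajv v8).
const Ajv = require('ajv');
const addFormats = require('ajv-formats');

// allErrors reports every violation instead of stopping at the first one
const ajv = new Ajv({ allErrors: true });
addFormats(ajv);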

const schema = {
  "type": "object",
  "properties": {
    "name": {"type": "string", "minLength": 1},
    "age": {"type": "integer", "minimum": 0}
  },
  "required": ["name", "age"]
};

const validate = ajv.compile(schema);

// Valid data
const validData = {"name": "John", "age": 30};
console.log(validate(validData)); // true

// Invalid data
const invalidData = {"name": "", "age": -5};
console.log(validate(invalidData)); // false
console.log(validate.errors);

Tools and Libraries

Online Schema Generators

  • JSONSchema.net - Generate schemas from JSON
  • QuickType - Generate schemas and types
  • Transform.tools - Various JSON transformations

Programming Libraries

JavaScript

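// Generate schema from JSON (npm package: generate-schema)
const generateSchema = require('generate-schema');

const jsonData = { id: 1, name: 'Alice' }; // sample input
const schema = generateSchema.json('User', jsonData);
console.log(JSON.stringify(schema, null, 2));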

Python

# Generate schema from JSON (pip package: genson)
from genson import SchemaBuilder

json_data = {"id": 1, "name": "Alice"}  # sample input

builder = SchemaBuilder()
builder.add_object(json_data)
schema = builder.to_schema()
print(schema)

Command Line

# Using various CLI tools
npx quicktype --lang schema --src user.json
python -m genson sample.json

Common Use Cases

1. API Documentation

  • Document request/response formats
  • Generate interactive API docs
  • Validate API contracts

2. Data Validation

  • Form validation
  • Data import validation
  • Configuration file validation

3. Code Generation

  • Generate TypeScript interfaces
  • Create data models
  • Generate validation code

4. Testing

  • Validate test fixtures
  • Generate test data
  • Contract testing

Conclusion

Converting JSON to JSON Schema transforms concrete data examples into reusable validation and documentation tools. This process involves:

  • Analyzing data patterns to infer types and constraints
  • Adding validation rules for data integrity
  • Creating reusable schemas for consistent validation
  • Documenting data structures for better understanding

JSON Schema provides powerful validation capabilities including:

  • Type validation for all JSON data types
  • Constraint validation for ranges, lengths, and patterns
  • Structural validation for required fields and object shapes
  • Conditional validation for complex business rules

By converting JSON examples to JSON Schema, you create a foundation for robust data validation, clear API documentation, and automated code generation that improves data quality and developer experience.