
CapybaraDB Extended JSON (EmbJSON)

CapybaraDB Extended JSON (EmbJSON) is a data format designed to simplify database indexing pipelines, so that developers can use a single database for most LLM applications. Because embedding and indexing are handled through the format itself, AI features such as semantic search can be added without the overhead of managing multiple databases. Developers can then focus on building applications rather than on data architecture.

Key Features of EmbJSON

  • Custom Embedding Models: Specify an embedding model with the emb_model parameter to control how your data is represented in vector space.
  • LLM Optimization: Built for efficient text embeddings and vector-based queries, EmbJSON is ideal for semantic search.
  • Flexible Indexing: Customize data indexing to optimize embedding and retrieval based on your specific needs.
  • Asynchronous Processing: All EmbJSON data types are processed asynchronously by default, allowing client applications to continue running smoothly while embedding and indexing take place in the background.
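
To make the emb_model parameter and the asynchronous behavior described above more concrete, here is a minimal, self-contained sketch. It does not use the CapybaraDB SDK. The class names EmbTextSketch and SketchCollection, the "@embText" wrapper key, the status field, and the model name "example-embedding-model" are all assumptions made for illustration and may differ from the real API.

```python
import json
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class EmbTextSketch:
    """Hypothetical stand-in for an EmbText field; not the real SDK class."""

    text: str
    emb_model: str = "example-embedding-model"  # placeholder model name
    status: str = "pending"  # flipped to "indexed" by the background job

    def to_json(self) -> Dict[str, Any]:
        # Assumed wire shape: a tagged object holding the text and its model.
        return {"@embText": {"text": self.text, "emb_model": self.emb_model}}


class SketchCollection:
    """Hypothetical in-memory collection that indexes in a worker thread."""

    def __init__(self) -> None:
        self.docs: List[Dict[str, Any]] = []
        self._pool = ThreadPoolExecutor(max_workers=1)

    def insert(self, doc: Dict[str, Any]) -> Future:
        self.docs.append(doc)  # the document is stored right away
        return self._pool.submit(self._index, doc)  # embedding runs later

    def _index(self, doc: Dict[str, Any]) -> str:
        for value in doc.values():
            if isinstance(value, EmbTextSketch):
                # A real system would call the embedding model on value.text here.
                value.status = "indexed"
        return "indexed"


collection = SketchCollection()
bio = EmbTextSketch("Alice is a data scientist who builds search systems.")
job = collection.insert({"name": "Alice", "bio": bio})

# The caller is free to do other work here; result() just waits for the job.
status = job.result()
payload = json.dumps({"name": "Alice", "bio": bio.to_json()})
```

The point of the sketch is the shape of the interaction: insert returns without waiting for embedding to finish, and the embedding model is chosen per field rather than per database.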

Overview of EmbJSON Data Types

  • EmbText: Designed for storing and embedding text data. EmbText can handle everything from single words to lengthy documents, embedding and indexing content automatically for semantic search.
  • EmbImage (Coming 2025): Will support image data, enabling semantic search over images using text queries.
  • EmbVideo (Future Release): Will handle video data, enabling semantic embedding and search within video content.
  • EmbFile (Future Release): A generic type for managing a variety of file formats, including PDFs, Word documents, and more, allowing for semantic embedding and search across different types of content.
  • Other Data Types: Additional data types like EmbAudio and Emb3D are also planned for future releases, aimed at enabling semantic embeddings for audio files and 3D models.
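
As a rough illustration of what "embedding and indexing for semantic search" means for a type like EmbText, the sketch below ranks stored texts against a query by cosine similarity. The toy_embed function is a deliberately simple stand-in (word counts) for a real embedding model, so it only captures word overlap, not meaning. None of these function names come from CapybaraDB.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def toy_embed(text: str) -> Dict[str, float]:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, texts: List[str], top_k: int = 1) -> List[Tuple[float, str]]:
    """Return the top_k texts most similar to the query, best first."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, toy_embed(t)), t) for t in texts]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


texts = [
    "capybaras are large rodents native to south america",
    "vector databases store embeddings for similarity search",
]
best = search("how do databases store embeddings", texts)[0][1]
```

With a real embedding model in place of toy_embed, the same ranking step would match on meaning rather than on shared words.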

Why Choose CapybaraDB Extended JSON?

  • Streamlined Embedding: Select and customize embedding models to suit your needs.
  • Optimized Data Retrieval: Efficient indexing makes semantic search fast and reliable.
  • Simple Data Management: Seamlessly manage complex data types like embeddings.
  • Unified Database Solution: EmbJSON abstracts the complexity of database architecture for LLM applications, meaning you only need one database for most of your AI needs.

To explore how EmbJSON can elevate your AI projects, check out the detailed documentation and guides available.

