Getting started with Mahuta - A Search engine for the IPFS

Mahuta (formerly known as IPFS-Store) is a library and API that aggregates and consolidates the files or documents your application stores on the IPFS network. It provides a way to collect, store, index and search the data your application uses.

Features

Indexation: Mahuta stores documents or files on IPFS and indexes their hash together with optional metadata.
Discovery: Indexed documents and files can be searched using complex logical queries or fuzzy/full-text search.
Scalable: Optimised for large-scale applications using an asynchronous writing mechanism and caching.
Replication: A replica set can be configured to replicate (pin) content across multiple nodes (standard IPFS nodes or IPFS-cluster nodes).
Multi-platform: Mahuta can be used as a simple embedded Java library for your JVM-based application or run as a simple, scalable and configurable REST API.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Mahuta depends on two components:
- an IPFS node (go or js implementation)
- a search engine (currently only ElasticSearch is supported)

Start these two components first; see the guide on how to run IPFS and ElasticSearch.

Java library

Import the Maven dependencies (core module + indexer)

<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-core</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-indexing-elasticsearch</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>

Configure Mahuta to connect to an IPFS node and an indexer

Mahuta mahuta = new MahutaFactory()
    .configureStorage(IPFSService.connect("localhost", 5001))
    .configureIndexer(ElasticSearchService.connect("localhost", 9300, "cluster-name"))
    .defaultImplementation();

Execute high-level operations

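// Store a Markdown string on IPFS and index it with its metadata
// (ImmutableMap comes from Guava: com.google.common.collect.ImmutableMap)
IndexingResponse indexingResponse = mahuta.prepareStringIndexing("article", "## This is my first article")
    .contentType("text/markdown")
    .indexDocId("article-1")
    .indexFields(ImmutableMap.of("title", "First Article", "author", "greg"))
    .execute();

// Fetch the document by index name and document ID, loading the file as well
GetResponse getResponse = mahuta.prepareGet()
    .indexName("article")
    .indexDocId("article-1")
    .loadFile(true)
    .execute();

// Search the index for documents whose author is "greg" (first page, 20 results per page)
SearchResponse searchResponse = mahuta.prepareSearch()
    .indexName("article")
    .query(Query.newQuery().equals("author", "greg"))
    .pageRequest(PageRequest.of(0, 20))
    .execute();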

For more information, see the Mahuta Java API documentation.

Spring-Data

Import the Maven dependencies

<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-springdata</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>

Configure your spring-data repository

@IPFSDocument(index = "article", indexConfiguration = "article_mapping.json", indexContent = true)
public class Article {

    @Id
    private String id;

    @Hash
    private String hash;

    @Fulltext
    private String title;

    @Fulltext
    private String content;

    @Indexfield
    private Date createdAt;

    @Indexfield
    private String createdBy;
}



public class ArticleRepository extends MahutaRepositoryImpl<Article, String> {

    public ArticleRepository(Mahuta mahuta) {
        super(mahuta);
    }
}

For more information, see the Mahuta Spring Data documentation.

HTTP API with Docker

Prerequisites

- Docker
- IPFS and ElasticSearch running with Docker

Docker

$ docker run -it --name mahuta \
    -p 8040:8040 \
    -e MAHUTA_IPFS_HOST=ipfs \
    -e MAHUTA_ELASTICSEARCH_HOST=elasticsearch \
    gjeanmart/mahuta

Docker Compose

See the sample Docker Compose file.

Examples

For the API documentation, see the Mahuta HTTP API documentation.

Create the index `article`

Sample Request:

curl -X POST \
  http://localhost:8040/mahuta/config/index/article \
  -H 'Content-Type: application/json'

Success Response:
- Code: 200
  Content:

{
    "status": "SUCCESS"
}
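
The same call can be made from a JVM application without curl. The following is a minimal sketch, assuming Java 11 or later (for `java.net.http`) and a Mahuta instance listening on `localhost:8040` as in the Docker command above. The class `MahutaCreateIndex` and the method `createIndexRequest` are hypothetical names used for illustration; they are not part of Mahuta.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of Mahuta: builds the same request as the curl call above.
class MahutaCreateIndex {

    static HttpRequest createIndexRequest(String baseUrl, String indexName) {
        // Endpoint path taken from the curl example: /mahuta/config/index/{indexName}
        URI uri = URI.create(baseUrl + "/mahuta/config/index/" + indexName);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.noBody()) // the curl call sends no body
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = createIndexRequest("http://localhost:8040", "article");
        // Nothing is sent here; the request is only built and printed.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To send the request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and check that the status code is 200.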

Store and index an article and its metadata

Sample Request:

curl -X POST \
  'http://localhost:8040/mahuta/index' \
  -H 'content-type: application/json' \
  -d '{"content":"# Hello world,\n this is my first file stored on **IPFS**","indexName":"article","indexDocId":"hello_world","contentType":"text/markdown","index_fields":{"title":"Hello world","author":"Gregoire Jeanmart","votes":10,"date_created":1518700549,"tags":["general"]}}'

Success Response:
- Code: 200
  Content:

{
  "indexName": "article",
  "indexDocId": "hello_world",
  "contentId": "QmWHR4e1JHMs2h7XtbDsS9r2oQkyuzVr5bHdkEMYiqfeNm",
  "contentType": "text/markdown",
  "pinned": true,
  "indexFields": {
    "title": "Hello world",
    "author": "Gregoire Jeanmart",
    "votes": 10,
    "createAt": 1518700549,
    "tags": [
      "general"
    ]
  },
  "status": "SUCCESS"
}
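
Writing the JSON body by hand, as in the curl call above, is error-prone once the content contains quotes or line breaks. The sketch below shows one way to assemble that body in Java using only the standard library. `MahutaIndexBody`, `quote`, `value` and `toJson` are hypothetical helpers, not Mahuta classes, and the escaping covers only the most common characters; a real application would more likely use a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Mahuta: assembles the body of POST /mahuta/index.
class MahutaIndexBody {

    // Escapes backslashes, quotes and common whitespace; not a complete JSON escaper.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\")
                       .replace("\"", "\\\"")
                       .replace("\n", "\\n")
                       .replace("\r", "\\r")
                       .replace("\t", "\\t") + "\"";
    }

    // Numbers and booleans are written bare, maps and lists recursively, the rest as strings.
    static String value(Object v) {
        if (v == null) return "null";
        if (v instanceof Number || v instanceof Boolean) return v.toString();
        if (v instanceof Map) return toJson((Map<?, ?>) v);
        if (v instanceof Iterable) {
            StringJoiner array = new StringJoiner(",", "[", "]");
            for (Object item : (Iterable<?>) v) array.add(value(item));
            return array.toString();
        }
        return quote(String.valueOf(v));
    }

    static String toJson(Map<?, ?> fields) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<?, ?> e : fields.entrySet()) {
            json.add(quote(String.valueOf(e.getKey())) + ":" + value(e.getValue()));
        }
        return json.toString();
    }

    public static void main(String[] args) {
        // Field names mirror the curl request above.
        Map<String, Object> indexFields = new LinkedHashMap<>();
        indexFields.put("title", "Hello world");
        indexFields.put("author", "Gregoire Jeanmart");
        indexFields.put("votes", 10);
        indexFields.put("date_created", 1518700549);
        indexFields.put("tags", List.of("general"));

        Map<String, Object> body = new LinkedHashMap<>();
        body.put("content", "# Hello world,\n this is my first file stored on **IPFS**");
        body.put("indexName", "article");
        body.put("indexDocId", "hello_world");
        body.put("contentType", "text/markdown");
        body.put("index_fields", indexFields);

        System.out.println(toJson(body));
    }
}
```

`LinkedHashMap` is used so that the keys are emitted in insertion order, which keeps the output easy to compare with the curl payload.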

Search by query

Sample Request:

curl -X POST \
 'http://localhost:8040/mahuta/query/search?index=article' \
 -H 'content-type: application/json' \
 -d '{"query":[{"name":"title","operation":"CONTAINS","value":"Hello"},{"name":"author.keyword","operation":"EQUALS","value":"Gregoire Jeanmart"},{"name":"votes","operation":"GT","value":"5"}]}'

Success Response:
- Code: 200
  Content:

{
  "status": "SUCCESS",
  "page": {
    "pageRequest": {
      "page": 0,
      "size": 20,
      "sort": null,
      "direction": "ASC"
    },
    "elements": [
      {
        "metadata": {
          "indexName": "article",
          "indexDocId": "hello_world",
          "contentId": "Qmd6VkHiLbLPncVQiewQe3SBP8rrG96HTkYkLbMzMe6tP2",
          "contentType": "text/markdown",
          "pinned": true,
          "indexFields": {
            "author": "Gregoire Jeanmart",
            "votes": 10,
            "title": "Hello world",
            "createAt": 1518700549,
            "tags": ["general"]
          }
        }
      }
    ],
    "totalElements": 1,
    "totalPages": 1
  }
}
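
The search body is a list of criteria, each with a `name`, an `operation` and a `value`, so it lends itself to being built programmatically. The following is a minimal sketch, assuming Java 11 or later and the endpoint shown in the curl call above. `MahutaSearchRequest`, `Criterion`, `queryBody` and `searchRequest` are hypothetical names introduced for illustration and are not part of Mahuta.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Mahuta: builds the POST /mahuta/query/search request.
class MahutaSearchRequest {

    // One entry of the "query" array: field name, operation (e.g. CONTAINS, EQUALS, GT) and value.
    static final class Criterion {
        final String name;
        final String operation;
        final String value;

        Criterion(String name, String operation, String value) {
            this.name = name;
            this.operation = operation;
            this.value = value;
        }

        String toJson() {
            return "{\"name\":" + quote(name)
                    + ",\"operation\":" + quote(operation)
                    + ",\"value\":" + quote(value) + "}";
        }
    }

    // Escapes backslashes and quotes only; enough for simple field names and values.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String queryBody(List<Criterion> criteria) {
        return criteria.stream()
                .map(Criterion::toJson)
                .collect(Collectors.joining(",", "{\"query\":[", "]}"));
    }

    static HttpRequest searchRequest(String baseUrl, String index, List<Criterion> criteria) {
        String encodedIndex = URLEncoder.encode(index, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/mahuta/query/search?index=" + encodedIndex))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(queryBody(criteria)))
                .build();
    }

    public static void main(String[] args) {
        // Same three criteria as the curl request above.
        List<Criterion> criteria = List.of(
                new Criterion("title", "CONTAINS", "Hello"),
                new Criterion("author.keyword", "EQUALS", "Gregoire Jeanmart"),
                new Criterion("votes", "GT", "5"));
        // Nothing is sent here; the body is only built and printed.
        System.out.println(queryBody(criteria));
    }
}
```

As with the other requests, the result of `searchRequest` can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.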

Kauri original title: Getting started with Mahuta - A Search engine for the IPFS
Kauri original link: https://kauri.io/getting-started-with-mahuta-a-search-engine-for-th/874b1fe11d00406bbbef053405fd4538/a
Kauri original author: Grégoire Jeanmart (@gregjeanmart)
Kauri original Publication date: 2019-12-10
Kauri original tags: ipfs-store, elasticsearch, ipfs, mahuta, search-engine, caching
Kauri original hash: QmbaS915QBSxes6g5WU5jFVZVuv4T13xxTVd4uSkWbnsrD
Kauri original checkpoint: Qmekp5iiDi5N5M4KdtAVGBEJEF3ahMgWYZJqL7s1qmkQ9g
