# Elasticsearch API and Ingestion Pipeline

## Elasticsearch API

***`Task:`*** ***`Create a new index containing three documents. After creation, delete one document using the DELETE method and another using a query-based deletion. Then, reindex the remaining documents and perform an index flush.`***

First, let's open Kibana Dev Tools and initiate the creation of a new index.

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FAOtFIXirFWKCCkNXqggR%2FScreenshot.png?alt=media&#x26;token=bfe19520-f73c-477e-bfbb-f23f46798707" alt=""><figcaption></figcaption></figure>

```json
PUT /weinnovate 
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FtQ4iLdpc6ixG9H9E7YmK%2FScreenshot(1).png?alt=media&#x26;token=7d5c7ae0-c8cc-4972-8f36-c01fedec4423" alt=""><figcaption></figcaption></figure>

Now, let's add three documents to this index:

```json
POST /weinnovate/_doc/one
{
  "name": "Fares",
  "city": "Al Mahalla",
  "age": "23"
}

POST /weinnovate/_doc/two
{
  "name": "Omar",
  "city": "Cairo",
  "age": "23"
}

POST /weinnovate/_doc/three
{
  "name": "Ahmed",
  "city": "Mansoura",
  "age": "25"
}
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FKKvPM2A0GNB5OlbVHopp%2FScreenshot(2).png?alt=media&#x26;token=9fa859c7-e0f5-40f5-b468-16d36cb25688" alt=""><figcaption></figcaption></figure>
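The three requests above can also be sent in one round trip with the `_bulk` API. The following is a minimal sketch that assumes the same index name and document IDs as above. Each action line is followed by the document source on its own line. Running it after the documents already exist would simply overwrite them.

```json
POST /weinnovate/_bulk
{ "index": { "_id": "one" } }
{ "name": "Fares", "city": "Al Mahalla", "age": "23" }
{ "index": { "_id": "two" } }
{ "name": "Omar", "city": "Cairo", "age": "23" }
{ "index": { "_id": "three" } }
{ "name": "Ahmed", "city": "Mansoura", "age": "25" }
```

The response contains an `errors` flag and one result entry per action, which is worth checking because a bulk request can partially succeed.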

Now, let's delete the first document by its ID using the `DELETE` method, then search the index to confirm that it is gone.

```json
DELETE /weinnovate/_doc/one
GET /weinnovate/_search
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FDsjCD8TSzVjQshR4Uupl%2FScreenshot(3).png?alt=media&#x26;token=acc887a0-08ce-4bd1-9ce8-fcd3fa3a6b4a" alt=""><figcaption></figcaption></figure>
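Search results only reflect a change after the next index refresh, so a search issued immediately after a delete can occasionally still show the document. As a small sketch using the same index and ID as above, the document can also be requested directly by ID. For a deleted document, the response should report `"found": false`.

```json
GET /weinnovate/_doc/one
```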

Next, let's delete a document by query.

```json
POST /weinnovate/_delete_by_query
{
  "query":{
    "match":{
      "name": "Omar"
    }
  }
}

GET /weinnovate/_search
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FVfGQTfZ827e1jm4Jjdz9%2FScreenshot(4).png?alt=media&#x26;token=5d377523-3995-47cb-b7e6-a958f0026058" alt=""><figcaption></figcaption></figure>

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FPheto1QmEYahxDM2d4Q5%2FScreenshot(5).png?alt=media&#x26;token=e3b72c87-e604-46c8-92e9-ad6df25c0ceb" alt=""><figcaption></figcaption></figure>
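A `match` query analyzes its input, so on a `text` field it matches every document whose `name` contains the token `omar`, for example a hypothetical "Omar Ali" as well. When only an exact value should be deleted, a `term` query on the keyword sub-field is stricter. The sketch below assumes the index uses the default dynamic mapping, which gives string fields a `.keyword` sub-field. It is shown for illustration only: at this point in the walkthrough the matching document has already been removed, so it would delete nothing.

```json
POST /weinnovate/_delete_by_query
{
  "query": {
    "term": {
      "name.keyword": "Omar"
    }
  }
}
```

The `deleted` value in the response shows how many documents were actually removed.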

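We now need to reindex the remaining documents, which copies them from one index to another. Before reindexing, we create the destination index. The reindex API can also create a missing destination index automatically, but creating it first is the place to define mappings or settings if you need them.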

```json
PUT weinnovate_new
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2F4XHjcApVY2xW98xqcJIH%2FScreenshot(6).png?alt=media&#x26;token=55a0e313-300e-4c33-b44e-c4b06b516aa6" alt=""><figcaption></figcaption></figure>

Now, let's reindex the data from **`weinnovate`** to **`weinnovate_new`**.

```json
POST _reindex
{
  "source": {
    "index": "weinnovate"
  },
  "dest": {
    "index": "weinnovate_new"
  }
}
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FOnNIwHMpnm0Ksz3YpGw7%2FScreenshot(7).png?alt=media&#x26;token=ac9efb78-1d56-436c-987e-012f3f8023cc" alt=""><figcaption></figcaption></figure>

```json
GET /weinnovate_new/_search
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FfrMaPWt3j7W2GeGmxwKG%2FScreenshot(8).png?alt=media&#x26;token=299e77f9-712b-4934-8d9a-c9724fca6930" alt=""><figcaption></figcaption></figure>
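A quick way to check that the copy is complete is to compare document counts in the source and destination. This is a minimal sketch using the two index names from the steps above. Both requests should report the same `count` once the destination index has been refreshed.

```json
GET /weinnovate/_count

GET /weinnovate_new/_count
```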

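Flushing an index makes sure that any data currently held only in the transaction log is also permanently stored in the Lucene index on disk, which allows Elasticsearch to clear the corresponding transaction log entries. Elasticsearch flushes automatically as needed, so a manual flush like the one below is rarely required in practice.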

```json
POST weinnovate_new/_flush
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FTu69bFfzHnryzfRaTNMK%2FScreenshot.png?alt=media&#x26;token=48a5028e-6470-4b17-b034-4f229c99507a" alt=""><figcaption></figcaption></figure>

## Ingestion Pipeline

Ingest pipelines let you perform common transformations on your data before indexing. For example, you can use pipelines to remove fields, extract values from text, and enrich your data.

***`Task: Create a pipeline with three distinct processors for a new index.`***

First, we will create a pipeline named **`weinnovate_pipline`** (the ID used in the requests below), which includes three processors.

```json
PUT _ingest/pipeline/weinnovate_pipline
{
  "description": "New pipeline for the weinnovate index",
  "processors": [
    {
      "append": {
        "field": "Status",
        "value": ["Active"]
      }
    },
    {
      "convert": {
        "field": "age",
        "type": "integer"
      }
    },
    {
      "uppercase": {
        "field": "name"
      }
    }
  ]
}
```

* The **`append`** processor appends the value **`"Active"`** to the **`Status`** field, creating the field as an array if it does not exist yet.
* The **`convert`** processor converts the value of **`age`** from a string to an **`integer`**.
* The **`uppercase`** processor converts the value of the **`name`** field to uppercase.

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2Fh3KYoUSacGFR70hjnCE8%2FScreenshot(1).png?alt=media&#x26;token=1d32d329-4a52-4119-943f-8631e1e03ec6" alt=""><figcaption></figcaption></figure>

We need to confirm that the pipeline was created successfully.

```json
GET _ingest/pipeline/weinnovate_pipline
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FBIrMLwF9IRTXy7iBkIH7%2FScreenshot(2).png?alt=media&#x26;token=a416e2d9-cd98-4020-944e-872b20c88f1a" alt=""><figcaption></figcaption></figure>
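Before running real data through the pipeline, it can be tried out with the simulate API, which applies the processors to sample documents without indexing anything. The sketch below uses the pipeline ID exactly as written in the `PUT` request above and a sample document shaped like the ones indexed earlier.

```json
POST _ingest/pipeline/weinnovate_pipline/_simulate
{
  "docs": [
    {
      "_source": {
        "name": "Ahmed",
        "city": "Mansoura",
        "age": "25"
      }
    }
  ]
}
```

In the simulated result, `name` should come back in uppercase, `age` as a number rather than a string, and `Status` as an array containing `"Active"`.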

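Now we want to apply this pipeline to the data in the previously created **`weinnovate`** index. Ingest pipelines run only when a document is indexed, so documents that are already stored are not transformed. To process them, we reindex the existing data into a new index, **`weinnovate_two`**, and name the pipeline in the reindex destination.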

```json
POST _reindex
{
  "source": {
    "index": "weinnovate"
  },
  "dest": {
    "index": "weinnovate_two",  
    "pipeline": "weinnovate_pipline"
  }
}
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FALkBzN9Ve0VHnz8TZbnR%2FScreenshot(3).png?alt=media&#x26;token=709dd050-5f6a-459f-9f95-8c682054bf32" alt=""><figcaption></figcaption></figure>

Let's verify this by retrieving the documents from the **`weinnovate_two`** index.

```json
GET /weinnovate_two/_search
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FRaGpsqAAYscVVddMxKFY%2FScreenshot(4).png?alt=media&#x26;token=cc24df07-7102-4be7-82d5-2e1271519923" alt=""><figcaption></figcaption></figure>

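Alternatively, we can create the pipeline first and attach it to a new index as its default pipeline through the `index.default_pipeline` setting, so that every document indexed into it afterwards passes through the pipeline.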
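One way to do this is sketched below. The index name `weinnovate_three` and the sample document are hypothetical and only serve as an illustration. The pipeline ID is the one created earlier.

```json
PUT /weinnovate_three
{
  "settings": {
    "index.default_pipeline": "weinnovate_pipline"
  }
}

POST /weinnovate_three/_doc
{
  "name": "Sara",
  "city": "Giza",
  "age": "30"
}

GET /weinnovate_three/_search
```

With the default pipeline set on the index, the indexing request needs no extra parameters. The stored document should show the same transformations as the reindexed data in **`weinnovate_two`**.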
