@masseyke

@masseyke masseyke commented Nov 20, 2024

As described in #116497, the simulate ingest API now returns a new ignored_fields array for a document if indexing it would ignore any fields in the input. For example:

curl -X POST "localhost:9200/_ingest/_simulate?pretty" -H 'Content-Type: application/json' -d'
{
  "docs": [
    {        
      "_index": "simulate-test",
      "_id": "y9Es_JIBiw6_GgN-U0qy",
      "_score": 1,
      "_source": {
        "abc": "sfdsfsfdsfsfdsfsfdsfsfdsfsfdsf"
      }    
    }      
  ],
  "index_template_substitutions": {
    "ind_temp": {
      "index_patterns": ["simulate-test"],
      "composed_of": ["simulate-test"]
    }    
  },
  "component_template_substitutions": {
    "simulate-test": {
      "template": {
        "mappings": {
          "dynamic": false,
          "properties": {
            "abc": {
              "type": "keyword",
              "ignore_above": 1
            }
          }  
        }    
      }    
    }      
  }
}
'

Response:

{
  "docs" : [
    {
      "doc" : {
        "_id" : "y9Es_JIBiw6_GgN-U0qy",
        "_index" : "simulate-test",
        "_version" : -3,
        "_source" : {
          "abc" : "sfdsfsfdsfsfdsfsfdsfsfdsfsfdsf"
        },
        "executed_pipelines" : [ ],
        "ignored_fields" : [
          {
            "field": "abc"
          }
        ]
      }
    }
  ]
}
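
To illustrate why abc shows up in ignored_fields above: the substituted mapping sets ignore_above to 1 on a keyword field, and the value sent is much longer than that. The following is only a minimal sketch of that rule, not the code in this PR. The class name IgnoreAboveSketch, the method ignoredFields, and the plain maps standing in for the source and the mapping are all hypothetical, and length is measured with String.length() for simplicity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Elasticsearch code.
final class IgnoreAboveSketch {

    // Returns the names of string fields whose value is longer than the
    // ignore_above limit configured for that field (assumed semantics).
    public static List<String> ignoredFields(Map<String, Object> source, Map<String, Integer> ignoreAboveByField) {
        List<String> ignored = new ArrayList<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            Integer limit = ignoreAboveByField.get(entry.getKey());
            Object value = entry.getValue();
            // Only string values with a configured limit can be ignored in this sketch.
            if (limit != null && value instanceof String && ((String) value).length() > limit) {
                ignored.add(entry.getKey());
            }
        }
        return ignored;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("abc", "sfdsfsfdsfsfdsfsfdsfsfdsfsfdsf");
        // Mirrors "ignore_above": 1 from the component template substitution above.
        System.out.println(ignoredFields(source, Map.of("abc", 1)));
    }
}
```

With the request above, the 30-character value exceeds the limit of 1, so the field would be reported rather than silently dropped.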

@masseyke masseyke added the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Nov 20, 2024
@masseyke masseyke added >enhancement v8.18.0 auto-backport Automatically create backport pull requests when merged labels Dec 17, 2024
@elasticsearchmachine

Hi @masseyke, I've created a changelog YAML for you.

@masseyke masseyke marked this pull request as ready for review December 17, 2024 20:25
@masseyke masseyke requested a review from dakrone December 17, 2024 20:26
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Dec 17, 2024
@elasticsearchmachine

Pinging @elastic/es-data-management (Team:Data Management)

@dakrone dakrone left a comment

LGTM, thanks Keith!

* This creates a temporary index with the mappings of the index in the request, and then attempts to index the source from the request
* into it. If there is a mapping exception, that exception is returned. On success the returned exception is null.
- * @parem componentTemplateSubstitutions The component template definitions to use in place of existing ones for validation
+ * @param componentTemplateSubstitutions The component template definitions to use in place of existing ones for validation
Member

Hah, nice catch :)

});
final Collection<String> ignoredFields;
if (result == null) {
ignoredFields = List.of();
Member

We could probably just do return List.of(); here and avoid having to create the ignoredFields local variable? It's not a big deal either way though.

Member Author

I kind of prefer having a single return statement rather than 3 -- it makes debugging easier for me.

Member

I'm okay either way :)

ignoredFields = List.of();
} else {
List<LuceneDocument> luceneDocuments = result.parsedDoc().docs();
if (luceneDocuments != null && luceneDocuments.size() == 1) {
Member

Should we add an assert luceneDocuments.size() == 1 somewhere here to ensure that we fail if, in the future, a single index request results in more than one doc? (Otherwise we'd silently ignore the response.)

Member Author

Sounds good.
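
For readers following the two threads above, here is a rough sketch of how the single-return style and the suggested assert could fit together. It is not the merged code: Doc and ParseResult are hypothetical stand-ins for the real parsed-document and LuceneDocument types, and ignoredFieldNames() is an invented accessor used only for illustration.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical illustration only; the types below are stand-ins, not Elasticsearch classes.
final class SingleReturnSketch {

    record Doc(List<String> ignoredFieldNames) {}

    record ParseResult(List<Doc> docs) {}

    public static Collection<String> ignoredFieldsOf(ParseResult result) {
        final Collection<String> ignoredFields;
        if (result == null) {
            // Nothing was parsed, so nothing can have been ignored.
            ignoredFields = List.of();
        } else {
            List<Doc> docs = result.docs();
            // Fails fast (when assertions are enabled) if one request ever yields several docs.
            assert docs == null || docs.size() == 1 : "expected exactly one document";
            if (docs != null && docs.size() == 1) {
                ignoredFields = List.copyOf(docs.get(0).ignoredFieldNames());
            } else {
                ignoredFields = List.of();
            }
        }
        // Single exit point, which gives one obvious place for a breakpoint.
        return ignoredFields;
    }

    public static void main(String[] args) {
        System.out.println(ignoredFieldsOf(null));
        System.out.println(ignoredFieldsOf(new ParseResult(List.of(new Doc(List.of("abc"))))));
    }
}
```

The assert only fires when the JVM runs with assertions enabled, so production behavior in this sketch still falls back to an empty collection.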

@masseyke masseyke merged commit 43e6fad into elastic:main Dec 23, 2024
15 of 16 checks passed
@masseyke masseyke deleted the simulate-ingest-return-ignored-fields branch December 23, 2024 21:53
masseyke added a commit to masseyke/elasticsearch that referenced this pull request Dec 23, 2024
@elasticsearchmachine

💚 Backport successful

Status Branch Result
8.x

Labels

auto-backport Automatically create backport pull requests when merged :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement Team:Data Management Meta label for data/management team v8.18.0 v9.0.0
