# Query

## **Common Terms**

|   Term   | Meaning |
| :------: | :-----------: |
|   Node   | A server running Elasticsearch |
|  Cluster | A group of nodes |
|   Index  | Database |
|   Type   | Table |
|   Field  | Column |
| Document | One record (row) |
|   Shard  | A partition that makes up part of an index |

## **Creating an Index and Adding Documents**

```python
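from elasticsearch import Elasticsearch

# Connect to the server
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

# If authentication is required ('ip&port' is a placeholder for the host address)
es = Elasticsearch(['ip&port'], http_auth=('username', 'password'))


# Create an index
# Status codes listed in ignore come back as an error response instead of raising an exception
# A mapping (table_format) describing the fields can be passed in body at creation time
# Here article is the doc_type
# Note: "string" with "not_analyzed" is pre-5.x syntax; 5.x and later use "keyword"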
table_format={
        "mappings":{
            "article":{
                "properties":{
                    "title":{
                        "type":"string",
                        "index":"not_analyzed"
                    },
                    "content" : {
                        "type":"string",
                        "index":"not_analyzed"
                    }
                }
            }
        }
    }
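res = es.indices.create(index='index_name', ignore=[400, 409, 404], body=table_format)


# Check whether the index exists; returns a boolean
res = es.indices.exists(index='index_name')


# Add a document to the doc_type
# id can be any string
# index : overwrites the document if the id already exists
# create: raises an error if the id already exists
data = {'title': 'example title', 'content': 'example content'}  # placeholder document
res = es.create(index='index_name', doc_type='article', id='aid', body=data)
res = es.index(index='index_name', doc_type='article', id='aid', body=data)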
```
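When many documents need to be added, grouping them into bulk actions avoids sending one request per document. The block below is a minimal sketch, assuming the actions are later sent with the `helpers.bulk` function from elasticsearch-py. The helper name `build_actions` and the sample articles are hypothetical, and the `_type` key only applies to older Elasticsearch versions that still use doc types.

```python
def build_actions(index_name, doc_type, articles):
    """Yield one bulk action per article (hypothetical helper).

    `articles` maps a document id to its body, e.g. {"aid": {"title": ...}}.
    """
    for doc_id, body in articles.items():
        yield {
            "_index": index_name,
            "_type": doc_type,  # drop this key on versions without doc types
            "_id": doc_id,
            "_source": body,
        }


# Made-up sample data, for illustration only.
sample_articles = {
    "a1": {"title": "first", "content": "hello"},
    "a2": {"title": "second", "content": "world"},
}
actions = list(build_actions("index_name", "article", sample_articles))

# With a live client, the actions could be sent like this:
# from elasticsearch import helpers
# helpers.bulk(es, actions)
```

Building the actions with a generator keeps memory use low when the source data is large, since `helpers.bulk` accepts any iterable.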

## Mapping

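Data types

text vs keyword

text: for more complex content such as articles and sentences. Suited to full-text search.

keyword: for simpler values such as email addresses and product model names. Suited to exact matching.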
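The text/keyword distinction can be written into a mapping body. This is a minimal sketch, assuming Elasticsearch 5.x or later, where `text` and `keyword` replaced the older `string` type. The helper name `build_mapping` and the `email` field are hypothetical. Passing `doc_type=None` produces the typeless layout used by 7.x and later.

```python
def build_mapping(text_fields, keyword_fields, doc_type=None):
    """Build an index-creation body with text and keyword fields (hypothetical helper)."""
    properties = {}
    for name in text_fields:
        properties[name] = {"type": "text"}     # analyzed, for full-text search
    for name in keyword_fields:
        properties[name] = {"type": "keyword"}  # stored as-is, for exact matching
    if doc_type is None:
        # Typeless layout (assumed target: 7.x and later)
        return {"mappings": {"properties": properties}}
    # Layout with a doc type level (5.x / 6.x)
    return {"mappings": {doc_type: {"properties": properties}}}


table_format = build_mapping(["title", "content"], ["email"])
typed_format = build_mapping(["title"], ["email"], doc_type="article")
```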

## Score (Relevance)

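Relevance appears to be based mainly on TF/IDF (gitbook 7-3, 8-3). Elasticsearch 5.0 and later default to BM25, a related similarity measure.

Factors: the distance between the matched terms; how many times a term occurs; the length of the text that matched.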
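To make the frequency and length factors concrete, here is a toy calculation. It is a simplified sketch that assumes the classic Lucene-style components (square-root term frequency, logarithmic inverse document frequency, and a field-length norm). It is not the exact formula Elasticsearch applies, and it does not model term distance. The function names and the sample corpus are made up for illustration.

```python
import math


def tf(term_freq):
    """More occurrences of the term in a field give a higher weight."""
    return math.sqrt(term_freq)


def idf(num_docs, doc_freq):
    """Terms that appear in fewer documents give a higher weight."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_norm(num_terms):
    """Shorter fields give a higher weight."""
    return 1.0 / math.sqrt(num_terms)


def toy_score(term, doc_tokens, corpus):
    """Combine the three factors for one term and one document (toy model)."""
    freq = doc_tokens.count(term)
    if freq == 0:
        return 0.0
    doc_freq = sum(1 for doc in corpus if term in doc)
    return tf(freq) * idf(len(corpus), doc_freq) * field_norm(len(doc_tokens))


# Made-up corpus: each document is a list of tokens.
corpus = [
    ["sony", "camera"],
    ["sony", "sony"],
    ["canon", "camera", "lens", "kit"],
]
scores = [toy_score("sony", doc, corpus) for doc in corpus]
```

In this toy model the second document scores highest for `sony` because the term occurs twice in a field of the same length, and the third scores zero because the term is absent.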

## Query Syntax

## match

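Results are ranked by \_score (the ranking looked a bit odd in testing). Letter case **matters** when the field is not analyzed, as with the not\_analyzed mapping above; on an analyzed field the standard analyzer lowercases both the text and the query.

The search string is broken into terms. If the whole string cannot be matched, documents that match only **part of the search string** are still returned.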

```python
#gitbook 7-4
query={
    'query': {
      'match': {
        'about':'Sony'
       }
    }
}
res = es.search(index='indexname', doc_type='typename', body=query)  # search for documents whose about field contains Sony
```
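The difference between an analyzed and a not-analyzed field can be imitated in plain Python. This is a toy model, not the real analyzer: it assumes analysis just lowercases the text and splits it on non-word characters, and that a match needs only one shared term. All function names here are hypothetical.

```python
import re


def analyze(text):
    """Toy analyzer: lowercase, then split on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def match_analyzed(field_value, query):
    """True if the field and the query share at least one analyzed term."""
    return bool(set(analyze(field_value)) & set(analyze(query)))


def match_not_analyzed(field_value, query):
    """A not-analyzed field is compared as one exact, case-sensitive value."""
    return field_value == query


partial_hit = match_analyzed("Sony camera is great", "sony phone")
no_hit = match_analyzed("Canon lens", "sony phone")
case_miss = match_not_analyzed("Sony", "sony")
```

Here `partial_hit` is true even though only one of the two query terms occurs in the field, which mirrors the partial matching described above, while `case_miss` is false because the stored value differs in case.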

## match\_phrase

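Results are also ranked by \_score. The search string is matched **as a whole phrase**, so the field must contain "很好用" (roughly "works really well") as written.

If the field is stored as an array, the phrase must be identical to one of the array elements.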

```python
#gitbook 7-4
qs={
    "query" : {
        "match_phrase" : {
            "about" : "很好用"
        }
    }
}
```
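Phrase matching can be pictured as looking for the query terms next to each other and in the same order. The following is a toy sketch over pre-split tokens, assuming no analysis beyond that. The function name `phrase_match` and the sample tokens are hypothetical.

```python
def phrase_match(field_tokens, phrase_tokens):
    """True if phrase_tokens occurs as a consecutive, ordered run in field_tokens."""
    size = len(phrase_tokens)
    if size == 0:
        return False
    last_start = len(field_tokens) - size
    return any(
        field_tokens[start:start + size] == phrase_tokens
        for start in range(last_start + 1)
    )


field = ["this", "phone", "works", "great"]
in_order = phrase_match(field, ["works", "great"])
reversed_order = phrase_match(field, ["great", "works"])
with_gap = phrase_match(field, ["phone", "great"])
```

Only `in_order` is true. The real `match_phrase` query also accepts a `slop` parameter that tolerates gaps between terms, which this sketch does not model.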

## term

Exact match on dates, numbers, and strings. When matching strings, remember to set the string field in the mapping to "index":"not\_analyzed".

```python
#gitbook 7-4
qs={
    "query" : {
        "term" : {
            "interests" : "workout"
        }
    }
}
```

## terms

Accepts multiple values; a document matches if it contains any one of them.

```python
#gitbook 7-4
# interests contains 'workout' or 'jogging'
qs={
    "query" : {
        "terms" : {
            "interests" : ["workout","jogging"]
        }
    }
}
```
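Since `term` takes a single value and `terms` takes a list, a small builder can pick the right one. This is a minimal sketch; the helper name `exact_query` is hypothetical, and the resulting dict is meant to be passed as `body` to `es.search`.

```python
def exact_query(field, values):
    """Return a term query for one value, or a terms query for a list or tuple."""
    if isinstance(values, (list, tuple)):
        return {"query": {"terms": {field: list(values)}}}
    return {"query": {"term": {field: values}}}


single = exact_query("interests", "workout")
several = exact_query("interests", ["workout", "jogging"])
```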

## aggregations

```python
# Aggregate the values of interests
qs={
  "aggs": {
    "search_all_interests": { # name of the aggregation
      "terms": { "field": "interests" }
    }
  }
}

# Aggregate all interests of employees whose first_name is Pan
qs={
  "query": {
    "match": {
      "first_name": "Pan"
    }
  },
  "aggs": {
    "all_interests": {
      "terms": {
        "field": "interests"
      }
    }
  }
}
```
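The aggregation results come back under the `aggregations` key of the search response, one bucket per distinct value. The sketch below reads those buckets from a response dict. The helper name `bucket_counts` is hypothetical, and `sample_response` is a made-up, trimmed response used only to show the shape.

```python
def bucket_counts(response, agg_name):
    """Map each bucket key of a terms aggregation to its document count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Made-up, trimmed response for illustration; a real one has more keys.
sample_response = {
    "hits": {"total": 3, "hits": []},
    "aggregations": {
        "all_interests": {
            "buckets": [
                {"key": "music", "doc_count": 2},
                {"key": "sports", "doc_count": 1},
            ]
        }
    },
}
counts = bucket_counts(sample_response, "all_interests")
```

With a live client, `response` would be the value returned by `es.search(index=..., body=qs)`.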

## Compound Queries

must: AND

must\_not: NOT

should: OR

```python
#gitbook 12-2
# Results whose interests do not contain sony
qs={
  "query" : {
      "bool":{
        "must_not":{
           "term":{
             "interests":"sony"
           }
         }
      }
   }
}


# Combined with aggs: aggregate interests over all results that are not sony
qs={
  "query" : {
    "bool":{
      "must_not":{
        "term":{
          "interests":"sony"
        }
      }
    }
  },
  "aggs": {
    "all_interests": {
      "terms": {
        "field": "interests"
      }
    }
  }
}
```
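The three clause types can be combined in a single `bool` query. This is a minimal sketch of a builder that includes only the clauses that were supplied. The helper name `bool_query` is hypothetical, and the field values reuse the examples above.

```python
def bool_query(must=None, should=None, must_not=None):
    """Build a bool query body from lists of clause dicts (hypothetical helper)."""
    clauses = {}
    if must:
        clauses["must"] = list(must)          # every clause has to match (AND)
    if should:
        clauses["should"] = list(should)      # optional clauses (OR)
    if must_not:
        clauses["must_not"] = list(must_not)  # no clause may match (NOT)
    return {"query": {"bool": clauses}}


qs = bool_query(
    must=[{"match": {"first_name": "Pan"}}],
    must_not=[{"term": {"interests": "sony"}}],
)
```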

## Range

| Operator | Meaning |
| :-----: | :------: |
|   gt    | greater than |
|   gte   | greater than or equal to |
|   lt    | less than |
|   lte   | less than or equal to |

```python
#gitbook 12-5
# No need to wrap it in filter
# price from 5000 (inclusive) up to 20000 (exclusive)
{
    "range" : {
        "price" : { # field name
            "gte" : 5000, # greater than or equal to
            "lt" : 20000  # less than
        }
    }
}
        }
    }
}
```
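A range clause on its own is not a full request body; it has to sit under `query`. The sketch below builds the clause and rejects operator names other than the four listed above. The helper name `range_query` is hypothetical.

```python
ALLOWED_BOUNDS = {"gt", "gte", "lt", "lte"}


def range_query(field, **bounds):
    """Build a range clause; only gt, gte, lt, and lte are accepted."""
    unknown = set(bounds) - ALLOWED_BOUNDS
    if unknown:
        raise ValueError("unsupported range operators: %s" % sorted(unknown))
    return {"range": {field: bounds}}


clause = range_query("price", gte=5000, lt=20000)
qs = {"query": clause}  # full body that could be passed to es.search
```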

### Nested Conditions

<https://www.elastic.co/blog/lost-in-translation-boolean-operations-and-filters-in-the-bool-query>

Check whether the connection succeeded

```python
if not es.ping():
    raise ValueError("Connection failed")
```
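If the server may still be starting up, the ping check can be retried a few times before giving up. This is a minimal sketch that works with any object exposing a `ping()` method. The helper `wait_for_connection` and the `FakeClient` stand-in are hypothetical and exist only so the example runs without a server.

```python
import time


def wait_for_connection(client, retries=3, delay=1.0):
    """Return True as soon as client.ping() succeeds, else False after all retries."""
    for attempt in range(retries):
        if client.ping():
            return True
        if attempt < retries - 1:
            time.sleep(delay)  # pause before the next attempt
    return False


class FakeClient:
    """Stand-in for an Elasticsearch client: ping() fails a set number of times."""

    def __init__(self, failures):
        self.failures = failures

    def ping(self):
        if self.failures > 0:
            self.failures -= 1
            return False
        return True


connected = wait_for_connection(FakeClient(failures=2), retries=3, delay=0)
```

With a real client, the call would be `wait_for_connection(es)`, followed by raising an error if it returns false.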

API doc: <http://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch>

Chinese gitbook: <https://es.xiaoleilu.com/index.html>


