opensearch-project · kylehounslow · Dec 15, 2025
@@ -1,34 +1,38 @@
-# ad (deprecated by ml command)  
+# ad (deprecated by ml command)
 
-## Description  
 
-The `ad` command applies Random Cut Forest (RCF) algorithm in the ml-commons plugin on the search result returned by a PPL command. Based on the input, the command uses two types of RCF algorithms: fixed-in-time RCF for processing time-series data, batch RCF for processing non-time-series data.
-## Syntax  
+The `ad` command applies the Random Cut Forest (RCF) algorithm from the ml-commons plugin to the search results returned by a PPL command. Depending on the input, the command uses one of two RCF algorithms: fixed-in-time RCF for time-series data or batch RCF for non-time-series data.
 
-## Fixed In Time RCF For Time-series Data  
+## Syntax
 
-ad [number_of_trees] [shingle_size] [sample_size] [output_after] [time_decay] [anomaly_rate] \<time_field\> [date_format] [time_zone] [category_field]
-* number_of_trees: optional. Number of trees in the forest. **Default:** 30.  
-* shingle_size: optional. A shingle is a consecutive sequence of the most recent records. **Default:** 8.  
-* sample_size: optional. The sample size used by stream samplers in this forest. **Default:** 256.  
-* output_after: optional. The number of points required by stream samplers before results are returned. **Default:** 32.  
-* time_decay: optional. The decay factor used by stream samplers in this forest. **Default:** 0.0001.  
-* anomaly_rate: optional. The anomaly rate. **Default:** 0.005.  
-* time_field: mandatory. Specifies the time field for RCF to use as time-series data.  
-* date_format: optional. Used for formatting time_field. **Default:** "yyyy-MM-dd HH:mm:ss".  
-* time_zone: optional. Used for setting time zone for time_field. **Default:** "UTC".  
-* category_field: optional. Specifies the category field used to group inputs. Each category will be independently predicted.  
+The following sections describe the syntax for each RCF algorithm type.
+
+## Fixed-in-time RCF for time-series data
+
+`ad [number_of_trees] [shingle_size] [sample_size] [output_after] [time_decay] [anomaly_rate] <time_field> [date_format] [time_zone] [category_field]`
+* `number_of_trees`: optional. Number of trees in the forest. **Default:** 30.  
+* `shingle_size`: optional. The size of a shingle, which is a consecutive sequence of the most recent records. **Default:** 8.  
+* `sample_size`: optional. The sample size used by stream samplers in this forest. **Default:** 256.  
+* `output_after`: optional. The number of points required by stream samplers before results are returned. **Default:** 32.  
+* `time_decay`: optional. The decay factor used by stream samplers in this forest. **Default:** 0.0001.  
+* `anomaly_rate`: optional. The anomaly rate. **Default:** 0.005.  
+* `time_field`: mandatory. Specifies the time field for RCF to use as time-series data.  
+* `date_format`: optional. Specifies the date format for `time_field`. **Default:** "yyyy-MM-dd HH:mm:ss".  
+* `time_zone`: optional. Specifies the time zone for `time_field`. **Default:** "UTC".  
+* `category_field`: optional. Specifies the category field used to group inputs. Predictions are made independently for each category.  
 
-## Batch RCF For Non-time-series Data  
 
-ad [number_of_trees] [sample_size] [output_after] [training_data_size] [anomaly_score_threshold] [category_field]
-* number_of_trees: optional. Number of trees in the forest. **Default:** 30.  
-* sample_size: optional. Number of random samples given to each tree from the training data set. **Default:** 256.  
-* output_after: optional. The number of points required by stream samplers before results are returned. **Default:** 32.  
-* training_data_size: optional. **Default:** size of your training data set.  
-* anomaly_score_threshold: optional. The threshold of anomaly score. **Default:** 1.0.  
-* category_field: optional. Specifies the category field used to group inputs. Each category will be independently predicted.  
+## Batch RCF for non-time-series data
+
+`ad [number_of_trees] [sample_size] [output_after] [training_data_size] [anomaly_score_threshold] [category_field]`
+* `number_of_trees`: optional. Number of trees in the forest. **Default:** 30.  
+* `sample_size`: optional. Number of random samples given to each tree from the training dataset. **Default:** 256.  
+* `output_after`: optional. The number of points required by stream samplers before results are returned. **Default:** 32.  
+* `training_data_size`: optional. **Default:** size of your training dataset.  
+* `anomaly_score_threshold`: optional. The anomaly score threshold. **Default:** 1.0.  
+* `category_field`: optional. Specifies the category field used to group inputs. Predictions are made independently for each category.  
 
+
 ## Example 1: Detecting events in New York City from taxi ridership data with time-series data  
 
 This example trains an RCF model and uses the model to detect anomalies in the time-series ridership data.
@@ -51,6 +55,7 @@ fetched rows / total rows = 1/1
 +---------+---------------------+-------+---------------+
 ```
 
+
 ## Example 2: Detecting events in New York City from taxi ridership data with time-series data independently with each category  
 
 This example trains an RCF model and uses the model to detect anomalies in the time-series ridership data with multiple category values.
@@ -74,6 +79,7 @@ fetched rows / total rows = 2/2
 +----------+---------+---------------------+-------+---------------+
 ```
 
+
 ## Example 3: Detecting events in New York City from taxi ridership data with non-time-series data  
 
 This example trains an RCF model and uses the model to detect anomalies in the non-time-series ridership data.
@@ -96,6 +102,7 @@ fetched rows / total rows = 1/1
 +---------+-------+-----------+
 ```
 
+
 ## Example 4: Detecting events in New York City from taxi ridership data with non-time-series data independently with each category  
 
 This example trains an RCF model and uses the model to detect anomalies in the non-time-series ridership data with multiple category values.
@@ -119,6 +126,7 @@ fetched rows / total rows = 2/2
 +----------+---------+-------+-----------+
 ```
 
+
 ## Limitations  
 
 The `ad` command can only work with `plugins.calcite.enabled=false`.
@@ -1,21 +1,22 @@
-# AddColTotals
+# addcoltotals
 
 
-# Description
 
-The `addcoltotals` command computes the sum of each column and add a summary event at the end to show the total of each column. This command works the same way `addtotals` command works with row=false and col=true option. This is useful for creating summary reports with subtotals or grand totals. The `addcoltotals` command only sums numeric fields (integers, floats, doubles). Non-numeric fields in the field list are ignored even if its specified in field-list or in the case of no field-list specified.
+The `addcoltotals` command computes the sum of each column and adds a summary event at the end to show the total of each column. This command works the same way as the `addtotals` command with the `row=false` and `col=true` options. This is useful for creating summary reports with subtotals or grand totals. The `addcoltotals` command only sums numeric fields (integers, floats, doubles). Non-numeric fields are ignored, whether they are listed in `field-list` or no `field-list` is specified.
 
-# Syntax
+## Syntax
+
+Use the following syntax:
 
 `addcoltotals [field-list] [label=<string>] [labelfield=<field>]`
 
 - `field-list`: Optional. Comma-separated list of numeric fields to sum.  If not specified, all numeric fields are summed.
 - `labelfield=<field>`: Optional. Field name to place the label. If it  specifies a non-existing field, adds the field and shows label at the summary event row at this field.
 - `label=<string>`: Optional. Custom text for the totals row labelfield\'s label. Default is \"Total\".
 
-# Example 1: Basic Example
+# Example 1: Basic example
 
-The example shows placing the label in an existing field.
+The following example PPL query shows how to use `addcoltotals` to place the label in an existing field.
 
 ```ppl
 source=accounts 
@@ -38,9 +39,9 @@ fetched rows / total rows = 4/4
 +-----------+---------+
 ```
 
-# Example 2: Adding column totals and adding a summary event with label specified.
+# Example 2: Adding column totals and adding a summary event with label specified
 
-The example shows adding totals after a stats command where final summary event label is \'Sum\' and row=true value was used by default when not specified. It also added new field specified by labelfield as it did not match existing field.
+The following example PPL query shows how to use `addcoltotals` to add totals after a `stats` command, where the final summary event label is 'Sum' and the `row=true` value is used by default when not specified. Because the field specified by `labelfield` does not match an existing field, the command adds it as a new field.
 
 ```ppl
 source=accounts 
@@ -63,7 +64,7 @@ fetched rows / total rows = 3/3
 
 # Example 3: With all options
 
-The example shows using addcoltotals with all options set.
+The following example PPL query shows how to use `addcoltotals` with all options set.
 
 ```ppl
 source=accounts 

@@ -1,12 +1,13 @@
-# AddTotals
+# addtotals
 
 
-## Description
 
-The `addtotals` command computes the sum of numeric fields and appends a row with the totals to the result. The command can also add row totals and add a field to store row totals. This is useful for creating summary reports with subtotals or grand totals. The `addtotals` command only sums numeric fields (integers, floats, doubles). Non-numeric fields in the field list are ignored even if it\'s specified in field-list or in the case of no field-list specified.
+The `addtotals` command computes the sum of numeric fields and appends a row with the totals to the result. The command can also add row totals and add a field to store row totals. This is useful for creating summary reports with subtotals or grand totals. The `addtotals` command only sums numeric fields (integers, floats, doubles). Non-numeric fields are ignored, whether they are listed in `field-list` or no `field-list` is specified.
 
 ## Syntax
 
+Use the following syntax:
+
 `addtotals [field-list] [label=<string>] [labelfield=<field>] [row=<boolean>] [col=<boolean>] [fieldname=<field>]`
 
 - `field-list`: Optional. Comma-separated list of numeric fields to sum. If not specified, all numeric fields are summed.
@@ -16,9 +17,9 @@ The `addtotals` command computes the sum of numeric fields and appends a row wit
 - `label=<string>`: Optional. Custom text for the totals row labelfield\'s label. Default is \"Total\". This is applicable when col=true. This does not have any effect when labelfield and fieldname parameter both have same value.
 - `fieldname=<field>`: Optional. Calculates total of each row and add a new field to store this total. This is applicable when row=true.
 
-## Example 1: Basic Example
+## Example 1: Basic example
 
-The example shows placing the label in an existing field.
+The following example PPL query shows how to use `addtotals` to place the label in an existing field.
 
 ```ppl
 source=accounts 
@@ -41,9 +42,9 @@ fetched rows / total rows = 4/4
 +-----------+---------+-------+
 ```    
 
-## Example 2: Adding column totals and adding a summary event with label specified.
+## Example 2: Adding column totals and adding a summary event with label specified
 
-The example shows adding totals after a stats command where final summary event label is \'Sum\'. It also added new field specified by labelfield as it did not match existing field.
+The following example PPL query shows how to use `addtotals` to add totals after a `stats` command, where the final summary event label is 'Sum'. Because the field specified by `labelfield` does not match an existing field, the command adds it as a new field.
 
 ```ppl
 source=accounts
@@ -66,7 +67,7 @@ fetched rows / total rows = 5/5
 +----------------+-----------+---------+-----+-------+
 ```
 
-if row=true in above example, there will be conflict between column added for column totals and column added for row totals being same field \'Total\', in that case the output will have final event row label null instead of \'Sum\' because the column is number type and it cannot output String in number type column. 
+If `row=true` is set in the preceding example, the column added for column totals conflicts with the column added for row totals because both use the same field, 'Total'. In that case, the label in the final event row is null instead of 'Sum' because the column has a numeric type and cannot hold a string value.
 
 ```ppl
 source=accounts
@@ -91,7 +92,7 @@ fetched rows / total rows = 5/5
 
 ## Example 3: With all options
 
-The example shows using addtotals with all options set.
+The following example PPL query shows how to use `addtotals` with all options set.
 
 ```ppl
 source=accounts 

@@ -1,21 +1,26 @@
-# append  
+# append
 
-## Description  
 
-The `append` command appends the result of a sub-search and attaches it as additional rows to the bottom of the input search results (The main search).
+The `append` command runs a sub-search and attaches its results as additional rows at the bottom of the input search results (the main search).
+
 The command aligns columns with the same field names and types. For different column fields between the main search and sub-search, NULL values are filled in the respective rows.
-## Syntax  
 
-append \<sub-search\>
-* sub-search: mandatory. Executes PPL commands as a secondary search.  
+## Syntax
+
+Use the following syntax:
+
+`append <sub-search>`
+* `sub-search`: mandatory. Executes PPL commands as a secondary search.  
 
+
 ## Limitations  
 
 * **Schema Compatibility**: When fields with the same name exist between the main search and sub-search but have incompatible types, the query will fail with an error. To avoid type conflicts, ensure that fields with the same name have the same data type, or use different field names (e.g., by renaming with `eval` or using `fields` to select non-conflicting columns).  
 
-## Example 1: Append rows from a count aggregation to existing search result  
 
-This example appends rows from "count by gender" to "sum by gender, state".
+## Example 1: Append rows from a count aggregation to existing search results
+
+The following example appends rows from "count by gender" to "sum by gender, state".
 
 ```ppl
 source=accounts | stats sum(age) by gender, state | sort -`sum(age)` | head 5 | append [ source=accounts | stats count(age) by gender ]
@@ -37,9 +42,10 @@ fetched rows / total rows = 6/6
 +----------+--------+-------+------------+
 ```
 
-## Example 2: Append rows with merged column names  
 
-This example appends rows from "sum by gender" to "sum by gender, state" with merged column of same field name and type.
+## Example 2: Append rows with merged column names
+
+The following example appends rows from "sum by gender" to "sum by gender, state", merging columns that have the same field name and type.
 
 ```ppl
 source=accounts | stats sum(age) as sum by gender, state | sort -sum | head 5 | append [ source=accounts | stats sum(age) as sum by gender ]

@@ -1,15 +1,18 @@
-# appendcol  
+# appendcol
 
-## Description  
 
-The `appendcol` command appends the result of a sub-search and attaches it alongside with the input search results (The main search).
-## Syntax  
+The `appendcol` command runs a sub-search and attaches its results alongside the input search results (the main search).
 
-appendcol [override=\<boolean\>] \<sub-search\>
+## Syntax
+
+Use the following syntax:
+
+`appendcol [override=<boolean>] <sub-search>`
 * override=<boolean>: optional. Boolean field to specify should result from main-result be overwritten in the case of column name conflict. **Default:** false.  
-* sub-search: mandatory. Executes PPL commands as a secondary search. The sub-search uses the same data specified in the source clause of the main search results as its input.  
+* `sub-search`: mandatory. Executes PPL commands as a secondary search. The sub-search uses the same data specified in the source clause of the main search results as its input.  
 
-## Example 1: Append a count aggregation to existing search result  
+
+## Example 1: Append a count aggregation to existing search results  
 
 This example appends "count by gender" to "sum by gender, state".
 
@@ -40,7 +43,8 @@ fetched rows / total rows = 10/10
 +--------+-------+----------+------------+
 ```
 
-## Example 2: Append a count aggregation to existing search result with override option  
+
+## Example 2: Append a count aggregation to existing search results with override option  
 
 This example appends "count by gender" to "sum by gender, state" with override option.
 
@@ -71,9 +75,10 @@ fetched rows / total rows = 10/10
 +--------+-------+----------+------------+
 ```
 
+
 ## Example 3: Append multiple sub-search results  
 
-This example shows how to chain multiple appendcol commands to add columns from different sub-searches.
+The following example PPL query shows how to chain multiple `appendcol` commands to add columns from different sub-searches.
 
 ```ppl
 source=employees
@@ -101,9 +106,10 @@ fetched rows / total rows = 9/9
 +------+-------------+-----+------------------+---------+
 ```
 
+
 ## Example 4: Override case of column name conflict  
 
-This example demonstrates the override option when column names conflict between main search and sub-search.
+The following example PPL query demonstrates how to use `appendcol` with the `override` option when column names conflict between the main search and the sub-search.
 
 ```ppl
 source=employees

@@ -1,15 +1,18 @@
-# appendpipe  
+# appendpipe
 
-## Description  
 
-The `appendpipe` command appends the result of the subpipeline to the search results. Unlike a subsearch, the subpipeline is not run first.The subpipeline is run when the search reaches the appendpipe command.
+The `appendpipe` command appends the result of the subpipeline to the search results. Unlike a subsearch, the subpipeline is not run first; it runs when the search reaches the `appendpipe` command.
 The command aligns columns with the same field names and types. For different column fields between the main search and sub-search, NULL values are filled in the respective rows.
-## Syntax  
 
-appendpipe [\<subpipeline\>]
-* subpipeline: mandatory. A list of commands that are applied to the search results from the commands that occur in the search before the `appendpipe` command.  
+## Syntax
+
+Use the following syntax:
+
+`appendpipe [<subpipeline>]`  
+* `subpipeline`: mandatory. A list of commands that are applied to the search results from the commands that occur in the search before the `appendpipe` command.  
 
-## Example 1: Append rows from a total count to existing search result  
+
+## Example 1: Append rows from a total count to existing search results  
 
 This example appends rows from "total by gender" to "sum by gender, state" with merged column of same field name and type.
 
@@ -37,6 +40,7 @@ fetched rows / total rows = 6/6
 +------+--------+-------+-------+
 ```
 
+
 ## Example 2: Append rows with merged column names  
 
 This example appends rows from "count by gender" to "sum by gender, state".
@@ -65,6 +69,7 @@ fetched rows / total rows = 6/6
 +----------+--------+-------+
 ```
 
+
 ## Limitations  
 
 * **Schema Compatibility**: Same as command `append`, when fields with the same name exist between the main search and sub-search but have incompatible types, the query will fail with an error. To avoid type conflicts, ensure that fields with the same name have the same data type, or use different field names (e.g., by renaming with `eval` or using `fields` to select non-conflicting columns).