Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
dabda81
Generalize prettyPrint, add parseHumanReadable and test
blambov Nov 29, 2022
5b91a01
Rename increaseSlightly to nextValidToken
blambov Dec 21, 2022
712527a
Rename sstableComparator to firstKeyComparator
blambov Mar 6, 2023
1318f1d
Encapsulate SSTableReader first and last
blambov Mar 15, 2023
e814929
Introduce CFS.getKeyspaceName()
blambov Mar 16, 2023
3be1fd4
Use SSTableReader.getMaxLocalDeletionTime
blambov Mar 17, 2023
f9a2d0e
Remove unused version of SSTableMultiWriter.finish
blambov Mar 17, 2023
a0d68b6
Fix code repetition in compaction writers
blambov Feb 27, 2023
1b06396
CASSANDRA-18123: Fix invalid reuse of metadata collector during flushing
blambov Jan 5, 2023
d4cf1f3
Fix weighted splitting
blambov Mar 20, 2023
e5582c3
Add flushSizeOnDisk metric
blambov Feb 28, 2023
fec6e8b
UnifiedCompactionStrategy, CEP-26
blambov Feb 27, 2023
955b73e
Put tokenSpaceCoverage in sstable metadata
blambov Nov 10, 2022
c095067
Work around CASSANDRA-18342
blambov Mar 17, 2023
9197ba7
Adds ability to change compaction default in YAML
blambov Feb 9, 2023
e2295f4
Trivial corrections
blambov Jun 16, 2023
13a71ae
Test and fix shard number calculation for multiple nodes/disks
blambov Jun 21, 2023
b2835f2
Set max_sstables_to_compact default to 32
blambov Jun 23, 2023
41defe6
Fix header
blambov Jun 26, 2023
6447708
Review comments
blambov Jun 26, 2023
fa86c79
Review comments
blambov Jun 28, 2023
08384a9
Switch to K instead of k for kilo
blambov Jul 6, 2023
a38d50f
Tweak flaky test for more information
blambov Jul 5, 2023
6b7e755
Fix flaky DiskSpaceMetricsTest
blambov Jul 18, 2023
58ba8f1
review nits
blambov Jul 18, 2023
f586649
DO NOT COMMIT: CircleCI config
blambov Jul 18, 2023
920022c
Added CHANGES.txt and NEWS.txt entries
blambov Jul 19, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1,103 changes: 680 additions & 423 deletions .circleci/config.yml

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CHANGES.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
5.0
* Implementation of the Unified Compaction Strategy as described in CEP-26 (CASSANDRA-18397)
* Upgrade commons cli to 1.5.0 (CASSANDRA-18659)
* Disable the deprecated keyspace/table thresholds and convert them to guardrails (CASSANDRA-18617)
* Deprecate CloudstackSnitch and remove duplicate code in snitches (CASSANDRA-18438)
Expand Down
4 changes: 4 additions & 0 deletions NEWS.txt
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,10 @@ using the provided 'sstableupgrade' tool.

New features
------------
- Added a new "unified" compaction strategy that supports the use cases of the legacy compaction strategies, with
low space overhead, high parallelism and flexible configuration. Implemented by the UnifiedCompactionStrategy
class. Further details and documentation can be found in
src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
- New `VectorType` (cql `vector<element_type, dimension>`) which adds new fixed-length element arrays. See CASSANDRA-18504
- Removed UDT type migration logic for 3.6+ clusters upgrading to 4.0. If migration has been disabled, it must be
enabled before upgrading to 5.0 if the cluster used UDTs. See CASSANDRA-18504
Expand Down
14 changes: 14 additions & 0 deletions conf/cassandra.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -1002,6 +1002,20 @@ snapshot_links_per_second: 0
# Min unit: KiB
column_index_cache_size: 2KiB

# Default compaction strategy, applied when a table's parameters do not
# specify compaction.
# The selected compaction strategy will also apply to system tables.
#
# The default is to use SizeTieredCompactionStrategy, with its default
# compaction parameters.
#
# default_compaction:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is a newly introduced configuration, so I think we should give a more detailed description abouth the usage, like :
class_name should be the different compaction strategy name , parameters sholud be the maps of differen parameters that different compaction strategy used. Then the next is an example

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started changing this to:

# Default compaction strategy, applied when a table's parameters do not
# specify compaction. A compaction "class_name" must be given, and optionally
# its class-specific configuration parameters under "parameters" as in the
# example below.
# The selected compaction strategy will also apply to system tables.
#
# The default is to use SizeTieredCompactionStrategy, with its default
# compaction parameters.

but then realized this is out of line with other such items in the YAML, which give the configuration by example only. E.g. commitlog_compression, hints_compression are described similarly and go through the same transformation (class among other parameters in the schema to class_name and parameters fields in the YAML).

I'm leaning towards keeping it short and in line with other items, but I'm happy to change it if you prefer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,then I think keep same with "commitlog_compression" or "hints_compression" is enough.

# class_name: UnifiedCompactionStrategy
# parameters:
# scaling_parameters: T4
# target_sstable_size: 1GiB


# Number of simultaneous compactions to allow, NOT including
# validation "compactions" for anti-entropy repair. Simultaneous
# compactions can help preserve read performance in a mixed read/write
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@
import org.apache.cassandra.db.virtual.LogMessagesTable;
import org.apache.cassandra.exceptions.ConfigurationException;
import org.apache.cassandra.service.FileSystemOwnershipCheck;
import org.apache.cassandra.utils.FBUtilities;
import org.apache.cassandra.utils.StorageCompatibilityMode;

// checkstyle: suppress below 'blockSystemPropertyUsage'
Expand Down Expand Up @@ -516,6 +517,12 @@ public enum CassandraRelevantProperties
TRIGGERS_DIR("cassandra.triggers_dir"),
TRUNCATE_BALLOT_METADATA("cassandra.truncate_ballot_metadata"),
TYPE_UDT_CONFLICT_BEHAVIOR("cassandra.type.udt.conflict_behavior"),
// See org.apache.cassandra.db.compaction.unified.Controller for the definition of the UCS parameters
UCS_BASE_SHARD_COUNT("unified_compaction.base_shard_count", "4"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to add some descriptions for these UCS configuration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a reference to Controller, where they are defined and described.

UCS_OVERLAP_INCLUSION_METHOD("unified_compaction.overlap_inclusion_method"),
UCS_SCALING_PARAMETER("unified_compaction.scaling_parameters", "T4"),
UCS_SURVIVAL_FACTOR("unified_compaction.survival_factor", "1"),
UCS_TARGET_SSTABLE_SIZE("unified_compaction.target_sstable_size", "1GiB"),
UDF_EXECUTOR_THREAD_KEEPALIVE_MS("cassandra.udf_executor_thread_keepalive_ms", "30000"),
UNSAFE_SYSTEM("cassandra.unsafesystem"),
/** User's home directory. */
Expand Down Expand Up @@ -725,6 +732,56 @@ public long getLong(long overrideDefaultValue)
return LONG_CONVERTER.convert(value);
}

/**
* Gets the value of a system property as a double.
* @return System property value if it exists, defaultValue otherwise. Throws an exception if no default value is set.
*/
public double getDouble()
{
String value = System.getProperty(key);
if (value == null && defaultVal == null)
throw new ConfigurationException("Missing property value or default value is not set: " + key);
return DOUBLE_CONVERTER.convert(value == null ? defaultVal : value);
}

/**
* Gets the value of a system property as a double.
* @return system property value if it exists, defaultValue otherwise.
*/
public double getDouble(double overrideDefaultValue)
{
String value = System.getProperty(key);
if (value == null)
return overrideDefaultValue;

return DOUBLE_CONVERTER.convert(value);
}

/**
* Gets the value of a system property, given as a human-readable size in bytes (e.g. 100MiB, 10GB, 500B).
* @return System property value if it exists, defaultValue otherwise. Throws an exception if no default value is set.
*/
public long getSizeInBytes()
{
String value = System.getProperty(key);
if (value == null && defaultVal == null)
throw new ConfigurationException("Missing property value or default value is not set: " + key);
return SIZE_IN_BYTES_CONVERTER.convert(value == null ? defaultVal : value);
}

/**
* Gets the value of a system property, given as a human-readable size in bytes (e.g. 100MiB, 10GB, 500B).
* @return System property value if it exists, defaultValue otherwise.
*/
public long getSizeInBytes(long overrideDefaultValue)
{
String value = System.getProperty(key);
if (value == null)
return overrideDefaultValue;

return SIZE_IN_BYTES_CONVERTER.convert(value);
}

/**
* Gets the value of a system property as an int.
* @return system property int value if it exists, overrideDefaultValue otherwise.
Expand Down Expand Up @@ -847,6 +904,32 @@ public interface PropertyConverter<T>
}
};

private static final PropertyConverter<Long> SIZE_IN_BYTES_CONVERTER = value ->
{
try
{
return FBUtilities.parseHumanReadableBytes(value);
}
catch (ConfigurationException e)
{
throw new ConfigurationException(String.format("Invalid value for system property: " +
"expected size in bytes with unit but got '%s'\n%s", value, e));
}
};

private static final PropertyConverter<Double> DOUBLE_CONVERTER = value ->
{
try
{
return Double.parseDouble(value);
}
catch (NumberFormatException e)
{
throw new ConfigurationException(String.format("Invalid value for system property: " +
"expected floating point value but got '%s'", value));
}
};

/**
* @return whether a system property is present or not.
*/
Expand Down
5 changes: 5 additions & 0 deletions src/java/org/apache/cassandra/config/Config.java
Original file line number Diff line number Diff line change
Expand Up @@ -1098,6 +1098,11 @@ public enum PaxosOnLinearizabilityViolation
public volatile long min_tracked_partition_tombstone_count = 5000;
public volatile boolean top_partitions_enabled = true;

/**
* Default compaction configuration, used if a table does not specify any.
*/
public ParameterizedClass default_compaction = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think the parameter name should be changed to default_compaction_for_table,or something else.
  2. we should add some detailed description in yaml(But I do not saw any information change in cassandra.yaml in this patch), because we may modify the user's default behavior, which needs to be perceived by the user. In the old way if a use create a table without specify the compaction strategy , the STCS is used, but now the strategy may changed if this paramter has been set to other compaction strategy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an entry in cassandra.yaml.

The flag specifies a default value for the compaction parameter. Could you elaborate on why it should be renamed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flag specifies a default value for the compaction parameter. Could you elaborate on why it should be renamed?

Actually, it feels like there's no need to change it


public static Supplier<Config> getOverrideLoadConfig()
{
return overrideLoadConfig;
Expand Down
5 changes: 5 additions & 0 deletions src/java/org/apache/cassandra/config/DatabaseDescriptor.java
Original file line number Diff line number Diff line change
Expand Up @@ -4751,4 +4751,9 @@ public static StorageCompatibilityMode getStorageCompatibilityMode()
else
return conf.storage_compatibility_mode;
}

public static ParameterizedClass getDefaultCompaction()
{
return conf != null ? conf.default_compaction : null;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ public AbstractCompactionController(final ColumnFamilyStore cfs, final long gcBe

public String getKeyspace()
{
return cfs.keyspace.getName();
return cfs.getKeyspaceName();
}

public String getColumnFamily()
Expand Down
Loading