Conversation
9fcc372 to
bccd92c
Compare
| DrainData DrainData `json:"type,omitempty"` | ||
| OmitMetadata bool | ||
| InternalTls bool | ||
| LogFilter *LogTypeSet |
There was a problem hiding this comment.
I don't think that this has to be a pointer.
There was a problem hiding this comment.
I tried that at first, but in manager.go line 45 Binding is used as a map key:
sourceDrainMap map[string]map[syslog.Binding]drainHolder
My understanding is that maps in Go are not comparable, so adding LogTypeSet (which is a map[LogType]struct{}) to the Binding struct makes it non-comparable.
AFAICS the alternative to using a pointer would be to implement a custom comparable key for the map, but using a pointer seemed simpler to me.
| default: | ||
| // Unknown log type, skip it | ||
| // TODO add unit test for unknown log type | ||
| d.log.Printf("Unknown log type '%s' in log type filter, ignoring", logType) |
There was a problem hiding this comment.
If there is a log message here and an app owner wrongly configures the drain, than for every app instance and every app log, a log ,message will be written.
The filter settings should be validated upfront and we should not end up in this situation. At this point we should only have valid drain configuration.
I guess we have to bring in #633 first and than this change.
There was a problem hiding this comment.
My understanding is that this code does not run for every log message, but only parsing the syslog drain config. I intentionally left logging out of filtering_drain_writer.go to avoid creating a log for each app log.
There was a problem hiding this comment.
After discussing with @chombium we concluded that yhis log gets created for each wrongly configured value in the log types of each syslog drain. This is emitted every few seconds on each VM within cloud foundry. While this is better than for each log line, we would still keep log load at a minimum.
To address this I now made sure to log at most one log line in this method.
| set := make(syslog.LogTypeSet, len(logTypes)) | ||
|
|
||
| for _, logType := range logTypes { | ||
| logType = strings.ToLower(strings.TrimSpace(logType)) |
There was a problem hiding this comment.
The query string in the URL is case sensitive. We should only accept lowercase and not convert eVerY POssibLE string to lower case.
There was a problem hiding this comment.
Did so.
Still, I wonder if we should be strict here about this. Not caring about casing might bring us some benefits as this might be something that users accidentally misconfigure. It should not hurt much to support both I think.
|
@chombium I left a bunch of replies for points that are not yet clear to me. All the rest look good. Thanks for the thorough review! You found a bunch of nice things. |
jorbaum
left a comment
There was a problem hiding this comment.
Worked in some comments.
| set := make(syslog.LogTypeSet, len(logTypes)) | ||
|
|
||
| for _, logType := range logTypes { | ||
| logType = strings.ToLower(strings.TrimSpace(logType)) |
There was a problem hiding this comment.
Did so.
Still, I wonder if we should be strict here about this. Not caring about casing might bring us some benefits as this might be something that users accidentally misconfigure. It should not hurt much to support both I think.
jorbaum
left a comment
There was a problem hiding this comment.
Worked in most comments. Still 2 left to clarify.
| default: | ||
| // Unknown log type, skip it | ||
| // TODO add unit test for unknown log type | ||
| d.log.Printf("Unknown log type '%s' in log type filter, ignoring", logType) |
There was a problem hiding this comment.
After discussing with @chombium we concluded that yhis log gets created for each wrongly configured value in the log types of each syslog drain. This is emitted every few seconds on each VM within cloud foundry. While this is better than for each log line, we would still keep log load at a minimum.
To address this I now made sure to log at most one log line in this method.
|
Feel free to take another look at it. |
108740d to
ef16bed
Compare
| return true | ||
| } | ||
|
|
||
| if sourceTypeTag == "" { |
There was a problem hiding this comment.
The source_type is always set. Why is this check needed?
There was a problem hiding this comment.
| Expect(err).To(HaveOccurred()) | ||
| }) | ||
|
|
||
| Context("when source_type tag is missing", func() { |
There was a problem hiding this comment.
Can a source type be missing for log envelopes?
Please read the Loggregator API spec for Log Envelopes.
There was a problem hiding this comment.
AFAIU envelope v1 defines it as optional, see https://github.com/cloudfoundry/dropsonde-protocol/tree/master/events#logmessage . Also in v2 envelopes source_type is merely a tag and therefore optional (also in protobuf file). I therefore think it is a good idea to handle this case even though I understand in practice this (currently) does not happen.
| for _, sourceType := range sourceTypes { | ||
| sourceType = strings.TrimSpace(sourceType) | ||
| t, _ := syslog.ParseSourceType(sourceType) | ||
| set.Add(t) |
There was a problem hiding this comment.
What if a source type is invalid? Shouldn't that throw an error and mark the drain as invalid?
There was a problem hiding this comment.
Yes. This is implemented in filtered_binding_fetcher's getUnknownSourceTypes(). It might make sense to write common function that is called in both filtered_binding_fetcher and binding_config .
| } | ||
|
|
||
| func (d *DrainParamParser) getLogFilter(u *url.URL) *syslog.LogFilter { | ||
| includeSourceTypes := u.Query().Get("include-source-types") |
There was a problem hiding this comment.
IMO it would be better if we add log in the names of the parameters, so that they have clear descriptive names. Maybe something like include-log-source-types and exclude-log-source-types will be better...
There was a problem hiding this comment.
I did so. No strong opinion from my side. However, I personally do not like this makes the query params quite long.
Are you sure we should make it that long?
| }, | ||
| { | ||
| name: "include-source-types=app", | ||
| url: "https://test.org/drain?include-source-types=app", |
There was a problem hiding this comment.
it should be types=APP. The urls are case sensitive. As the values of the source types are defined uppercase, I'm not sure if lower case should be accepted for the values.
There was a problem hiding this comment.
|
Thanks for the review. I replied to a bunch of comments for some clarifying remarks. Also a quick discussion about uppercase/lowercase config values would help me. |
I noticed that as my changes depend on `RawQuery` not being empty, but actually they were.
Also rename remaining instances of log type to source type
jorbaum
left a comment
There was a problem hiding this comment.
Addressed hopefully all comments.
I got a bunch of things to clarify. Biggest question that is left is likely if we should recommend users uppercase or lowercase values.
src/pkg/egress/syslog/source.go
Outdated
|
|
||
| // ParseSourceType parses a string into a SourceType value | ||
| func ParseSourceType(s string) (SourceType, bool) { | ||
| lt := SourceType(strings.ToUpper(s)) |
There was a problem hiding this comment.
Discussed with @chombium that we would like to be lenient here and normalize internally.
I am not so sure if we would like to suggest users and normalize internally to uppercase or lowercase here.
- Reason for lowercase as default: all other values for drain query params are lowercase, so we would follow this convention
- Reason for uppercase as default: source type is defined as uppercase in cloud foundry, so we would follow this convention
| } | ||
|
|
||
| func (d *DrainParamParser) getLogFilter(u *url.URL) *syslog.LogFilter { | ||
| includeSourceTypes := u.Query().Get("include-source-types") |
There was a problem hiding this comment.
I did so. No strong opinion from my side. However, I personally do not like this makes the query params quite long.
Are you sure we should make it that long?
| }, | ||
| { | ||
| name: "include-source-types=app", | ||
| url: "https://test.org/drain?include-source-types=app", |
There was a problem hiding this comment.
ef16bed to
ecab32a
Compare
Description
Addresses #711 :
?include-source-types=and?exclude-source-types(calling it source type instead of log type as log type is too generic and is already used in this code base)I still need to verify this, but I think a feature flag is not necessary as I tried to make string parsing more efficient by:
strings.IndexBytemight take some time, but it is used only with a single char/andstrings.IndexByte()should be more efficient thanstrings.Index()logTypePrefixesrather than in each log line, which should be more efficientRelated PRs:
Disclaimer: Claude helped me in crafting this code.
Type of change
Testing performed?
Checklist:
mainbranch, or relevant version branch