@somecookie
Contributor

No description provided.

@somecookie somecookie requested a review from SiiiTschiii May 11, 2021 14:33

### Comparison

We focus here on the file `application-pdf.pdf`, as it is detected by all compiled magic files. The results for the other files are similar in this scenario. The average benchmark time for each magic file is:

I find "focus" a bit confusing, since you show the results for many more MIME types. Maybe say that the comparison shows the average time it takes to detect application/pdf for the three different sizes of magic DBs.


To assess the performance improvement gained by using a minimal magic file, we created the benchmark
`main.c`. This benchmark measures the time needed to find the MIME type of a file with
a given compiled magic file.
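
As a rough illustration of the measurement described above, the sketch below times repeated calls to a detection callback with a monotonic clock. The names `detect_fn`, `time_detection`, and `stub_detect` are hypothetical and not taken from `main.c`; in a real run the callback would presumably wrap libmagic's `magic_file` on a cookie loaded with the compiled magic file under test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime */
#endif
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns a MIME string for `path`, or NULL. */
typedef const char *(*detect_fn)(void *ctx, const char *path);

/* Stand-in detector so the sketch runs without libmagic (assumption). */
static const char *stub_detect(void *ctx, const char *path)
{
    (void)ctx;
    (void)path;
    return "application/pdf";
}

/* Time `runs` calls of `detect` on `path` and store each duration, in
 * nanoseconds, in out[0..runs-1]. Returns 0 on success, -1 on failure. */
static int time_detection(detect_fn detect, void *ctx, const char *path,
                          size_t runs, double *out)
{
    for (size_t i = 0; i < runs; i++) {
        struct timespec start, end;

        if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
            return -1;
        const char *mime = detect(ctx, path);
        if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
            return -1;
        if (mime == NULL)
            return -1; /* detection failed, so the sample is meaningless */

        out[i] = (double)(end.tv_sec - start.tv_sec) * 1e9
               + (double)(end.tv_nsec - start.tv_nsec);
    }
    return 0;
}
```

Keeping the detector behind a callback means the same loop can be reused for each compiled magic file; only the context passed as `ctx` would change.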

Mention the 14.45% gain here already, between the full MIME DB and the MIME DB covering only X MIME types.

Can you also update the main README.md?

  • Add a link to this section
  • Mention the performance improvements of 14.45% in the first paragraph (as it basically shows the value of this project).

Do you have a hypothesis why we see an improvement of 14.45% here but a 30% performance improvement for overall squid performance?

Maybe it's worth mentioning, in the main README.md or in a new /benchmars/README.md, what /e2e vs /libmagic vs /mod are about.

Contributor Author

> Do you have a hypothesis why we see an improvement of 14.45% here but a 30% performance improvement for overall squid performance?

If you look at the results, they vary a lot. It would probably be better to compute some kind of average difference across all files rather than rely on just one file.
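
One way to act on this, sketched under assumptions: compute the relative gain per file, then report the mean together with the smallest and largest gain so the spread stays visible. The struct and function names are hypothetical, and the inputs are assumed to be per-file average times for the full and minimal magic files, in the same unit.

```c
#include <stddef.h>

/* Hypothetical summary of per-file gains, all in percent. */
struct gain_summary {
    double mean;
    double min;
    double max;
};

/* Relative gain of the minimal magic file over the full one, in percent. */
static double gain_pct(double full, double minimal)
{
    return (full - minimal) / full * 100.0;
}

/* Summarize the gains over n files. full[i] and minimal[i] are the average
 * times for file i. Returns 0 on success, -1 if n is 0 or a baseline time
 * is not positive. */
static int summarize_gains(const double *full, const double *minimal,
                           size_t n, struct gain_summary *out)
{
    if (n == 0)
        return -1;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (full[i] <= 0.0)
            return -1;

        double g = gain_pct(full[i], minimal[i]);
        if (i == 0 || g < out->min)
            out->min = g;
        if (i == 0 || g > out->max)
            out->max = g;
        sum += g;
    }
    out->mean = sum / (double)n;
    return 0;
}
```

Averaging per-file gains weights every file equally, whereas comparing summed times would let the slowest files dominate; which of the two is closer to overall squid performance is a separate question.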

return sqrt(var(measures, length));
}

int main(int argc, char *argv[])

Cool stuff, now you've programmed some Perl, Bash, and C, with Go to come!

Ricardo Ferreira Ribeiro and others added 3 commits May 19, 2021 10:39
@somecookie somecookie requested a review from SiiiTschiii May 19, 2021 08:43

3 participants