Mike Torrettinni

Compile code on the fly...?

Recommended Posts

I read this explanation of why the Everything tool is so fast at indexing and searching filenames: "Searches are compiled into byte code and executed."

Here: https://www.voidtools.com/forum/viewtopic.php?f=2&t=9463

 

Is this the same as compiling code on the fly and executing it, as in this question: https://stackoverflow.com/questions/16318136/can-i-compile-the-code-on-the-fly-in-delphi-and-execute-it/16319062 ?

 

My understanding is that all code is compiled and then executed at runtime; we just add some parameters, values from the UI, and other input.

 

Does anybody have a simpler explanation of what is meant by 'compiled into byte code and executed'? Is that some new trick to optimize and speed up code execution?

 

Thanks!

 

Guest
1 hour ago, Mike Torrettinni said:

Does anybody have a simpler explanation of what is meant by 'compiled into byte code and executed'?

I will try to explain this as simply as I can in the available time.

 

Let's imagine the search functionality: you have all the data in memory and need to perform a complex find/search on it. By complex I mean multiple entries with multiple criteria, like: the file name should start with a capital 'SE', the extension should end with 't', the last accessed date must not be after 1/1/2020 but the file was created in May 2018, and so forth, many things to compare. Writing this is simple and pretty straightforward, but it will involve many functions, comparing each condition and then moving to the next one. Now, how do you squeeze speed out of such an algorithm?

As an example, you will have a function like:


function FindByRulesCombination(A: TNameRule; B: TDateRule; C: TSizeRule; ...

which in turn will call functions such as:


function TestForRuleA(A: TNameRule): Boolean;

function TestForRuleB(B: TDateRule): Boolean;

function TestForRuleC(C: TSizeRule): Boolean;

....

Each of these might in turn call a few more functions. At the low-level assembly layer, the cost of jumping between these functions is very small, almost negligible, but repeat it a few million times and you get an amplified effect.

 

Now, how do you optimize such code? By building at runtime a function that does not jump and that tests all the criteria in place, in one small loop, the smallest we can build. This means the application generates byte code; in the case of the Windows OS it could generate an actual assembly function that does the same job as all of those functions combined. The resulting function would be something like this (just keep in mind that the following is not even close, it is only a simple way to explain the idea):


function VirtualTestBuiltAtRuntime: Boolean;
var
  i: Integer;
begin
  for i := 0 to LastFile do
  begin
    if Files[i].Size < MinimumSize then  // MinimumSize here is a const!
      Continue;
    if Files[i].LastAccessed >= LastAccessedAfterDateTime then  // LastAccessedAfterDateTime is a const!
      Continue;
    ......
  end;
end;

You may not be familiar with the impact of using a const in that case, but the saving is on the order of half a cycle to one full cycle per comparison, and in some cases it also saves CPU cache usage by minimizing memory reads for those constants. Even strings can be handled by comparing bytes from memory against fixed CPU word values with masks, loaded and compared directly in the generated assembly code. The whole thing comes down to comparing data from one place against constants built into the assembly instructions (the byte code), with no reads from the stack, since there are no local variables.

1 hour ago, Mike Torrettinni said:

Is that some new trick to optimize and speed up code execution?

No, not new at all. Some PLCs have been utilizing this for many years; I remember seeing Z80 code from the 80s doing such tricks, simply because that CPU ran at 3.58 MHz.
Is that something you want to try?
No, that is way above your skill level for now. It requires a whole different skill set from your programming knowledge, and will consume time beyond imagination, because you are literally building a mini compiler, with a specific algorithm targeting speed and efficiency.

Hope that clears it up for you.

Guest

Another example came to mind that I find amazing, though don't waste your time trying to use it.

 

Let's say I want to search for a file name that starts with "Mike". On Windows, case is ignored for file names. So are we going to pick a case and convert the whole DB, then loop over it? That could be a problem if the DB is very big. We are left with two options: convert the DB itself and lose the original name casing, or duplicate it and stress the RAM usage. But there is a third option: build a specialized search algorithm that ignores the casing.

SSE can do the following with ease; it is built for exactly this kind of case. Here I will show that approach in an example with pseudo code:

const MUP = 'MIKE';

const MDOWN = 'mike';

Load 4 chars from the DB (when the file name length is 4) and put them in R, where R is a CPU register; for wide chars that is 64 bits, or 32 bits in the ANSI case, but that is irrelevant here.

A = R - MUP , using byte- or word-wise subtraction

B = R - MDOWN , same as above

So if R is 'mIKe',

then A will be something like qq0000qq (using 32-bit notation for simplicity; qq is some value, but not zero),

and B will be 00qqqq00.

When R is something like 'MikT', then

A = 00qqqqqq,

B = qq0000qq, and the result of the byte-wise multiplication A*B will be 000000qq.

Can you see what happened here? We simply do a byte-wise multiplication, which is one instruction in SSE, and if the result is 00000000 the name is identical apart from case differences. We checked 4 chars with only 3 instructions. MMX also offers a nice set of instructions for this.
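For readers who prefer compilable code over register notation, here is the same subtract-and-multiply idea written as plain Pascal, one byte at a time. This is a minimal sketch, not from any real library; the function name is made up, and a real SSE version would process all four bytes with single packed instructions instead of a loop.

```pascal
// Case-insensitive compare of a 4-byte name against 'MIKE'/'mike',
// using the (R - upper) * (R - lower) = 0 trick described above.
function SameNameIgnoreCase4(const R: array of Byte): Boolean;
const
  MUP:   array[0..3] of Byte = (Ord('M'), Ord('I'), Ord('K'), Ord('E'));
  MDOWN: array[0..3] of Byte = (Ord('m'), Ord('i'), Ord('k'), Ord('e'));
var
  i: Integer;
begin
  // Each byte matches (ignoring case) only if it equals either the
  // upper-case or the lower-case constant, i.e. one factor is zero.
  for i := 0 to 3 do
    if (R[i] - MUP[i]) * (R[i] - MDOWN[i]) <> 0 then
      Exit(False);
  Result := True;
end;
```

With the bytes of 'mIKe' this returns True, and with 'MikT' it returns False, matching the register walkthrough above.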

 

13 minutes ago, Kas Ob. said:

You may not be familiar with the impact of using a const in that case, but the saving is on the order of half a cycle to one full cycle per comparison, and in some cases it also saves CPU cache usage by minimizing memory reads for those constants. Even strings can be handled by comparing bytes from memory against fixed CPU word values with masks, loaded and compared directly in the generated assembly code. The whole thing comes down to comparing data from one place against constants built into the assembly instructions (the byte code), with no reads from the stack, since there are no local variables.

 

13 minutes ago, Kas Ob. said:

Is that something you want to try?
No, that is way above your skill level for now. It requires a whole different skill set from your programming knowledge, and will consume time beyond imagination, because you are literally building a mini compiler, with a specific algorithm targeting speed and efficiency.

Very interesting! Thanks for the detailed explanation, and I agree, it is definitely above my current skill set, and I will probably never need this.

 

 

1 hour ago, Anders Melander said:

Thank you, but that was very confusing to read, the whole page. I was never good at pure theory; I need examples. Well, with this topic I doubt an example would help me anyway.

And if you look at the history of that wiki page, with so many corrections and additions over the years (800+ edits, 463+ editors since 2002) for a topic that mostly has 1 or 2 paragraphs per term, you realize it was a long journey to create. How can this be for rookies like myself? 🙂

48 minutes ago, Mike Torrettinni said:

Thank you, but that was very confusing to read, the whole page.

Okay. I'll try with a simpler example (apologies to @Kas Ob. if he/she/they already covered it, but it's Friday and TL;DR).

 

Byte code is just the instruction set for a specialized virtual CPU; in this case it's a virtual CPU that only knows how to search stuff.

 

A byte code compiler is a compiler that transforms "something" (in this case search parameters) into that virtual CPU's assembly language.

The implementation of the virtual CPU is called an interpreter if it executes the byte code instructions from scratch each time you run the "byte code program". If it instead compiles the byte code into actual processor opcodes the first time, and then executes those on the first and all subsequent runs, it's called a just-in-time (JIT) compiler.
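To make this concrete, here is a toy byte-code "search VM" sketched in Delphi-style Pascal. Everything's actual byte code is not public, so all opcodes, types, and names below are invented purely for illustration:

```pascal
// A two-opcode virtual CPU that only knows how to test file names.
// Search parameters would be "compiled" into a TSearchProgram once,
// then the interpreter loop below runs it against every file name.
type
  TOpCode = (opNameStartsWith, opExtEquals);
  TInstruction = record
    Op: TOpCode;
    Arg: string;   // the constant operand baked in at "compile" time
  end;
  TSearchProgram = array of TInstruction;

function RunProgram(const Prog: TSearchProgram;
  const FileName: string): Boolean;
var
  i: Integer;
begin
  Result := True;
  for i := 0 to High(Prog) do
    case Prog[i].Op of
      opNameStartsWith:
        if Copy(FileName, 1, Length(Prog[i].Arg)) <> Prog[i].Arg then
          Exit(False);
      opExtEquals:
        if Copy(FileName, Length(FileName) - Length(Prog[i].Arg) + 1,
                Length(Prog[i].Arg)) <> Prog[i].Arg then
          Exit(False);
    end;
end;
```

An interpreter runs this `case` dispatch loop on every execution; a JIT would instead translate the same instruction array once into real CPU opcodes and then call those directly.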

 

One of the first Pascal implementations was UCSD Pascal, which ran on the UCSD p-System, a byte code virtual machine.

 

P.S. You really should try to read about and understand these old concepts. While they might not seem relevant to you now, understanding and knowing about them can only improve your skills.



IIRC SQLite, LuaJIT, and the new PHP all achieve good performance via a bytecode compiler/executor (JIT).

On 11/6/2020 at 3:59 PM, Mike Torrettinni said:

I read this explanation of why the Everything tool is so fast at indexing and searching filenames: "Searches are compiled into byte code and executed."

They essentially wrote a regexp-like engine, and such engines usually use byte code.
Note that when a regexp engine uses a JIT it is faster, but not in all cases.
For the Pascal source code of a very fast regexp engine, check https://github.com/BeRo1985/flre - you will see how complex it is to make such things fast!

 

But to be fair, my guess is that Everything is fast as a tool not because its pattern search is fast, but because all the file names are already cached in memory, so it doesn't have to call the Windows API to get the file names.

The real bottleneck is not the search itself, but the OS API calls.

Byte-code search over the list may make the search a bit faster, but only by a few percent.

So this performance statement is more of a marketing statement: not false, but not the real performance trick.

 

As another code reference, perhaps easier to read than FLRE, check our open source TMatch object in https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.search.pas

It is very fast, e.g. when finding '*.pas' or 'toto*.pas' or '*inside*'. And when I say fast, I mean fast (it uses algorithms like Holub and Durian's (2005) SBNDM2).
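To get a feel for what such a matcher has to do, here is a deliberately naive '*'-only wildcard matcher in Pascal. This is only a sketch of the semantics; TMatch in mormot.core.search.pas is far more sophisticated, and the function name here is invented:

```pascal
// Recursive '*' wildcard matching: '*' matches any run of characters
// (including none); every other character must match literally.
function NaiveMatch(const Pattern, Text: string): Boolean;
begin
  if Pattern = '' then
    Exit(Text = '');
  if Pattern[1] = '*' then
    // Either '*' matches nothing, or it swallows one more char of Text.
    Result := NaiveMatch(Copy(Pattern, 2, MaxInt), Text) or
              ((Text <> '') and NaiveMatch(Pattern, Copy(Text, 2, MaxInt)))
  else
    Result := (Text <> '') and (Pattern[1] = Text[1]) and
              NaiveMatch(Copy(Pattern, 2, MaxInt), Copy(Text, 2, MaxInt));
end;
```

For example, NaiveMatch('*.pas', 'unit1.pas') yields True, while NaiveMatch('toto*.pas', 'unit1.pas') yields False. The fast algorithms mentioned above get the same answers without this exponential backtracking.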

And if you want to see some code about an expression search engine, using an AST and some kind of byte codes, check the TExprParserMatch class in the same unit.

 


 

55 minutes ago, Arnaud Bouchez said:

So this performance statement is more of a marketing statement: not false, but not the real performance trick.

Yes, the tool probably does a lot of things to make it fast (apart from the obvious accessing and creating of big index files). It's been around for many years, so I assume it uses many other performance tricks alongside byte-code searches, and they all add a little here and there.

 

Thanks for another view on this, and for the examples.

