Your Web News in One Place

Help Webnuz

Referal links:

Sign up for GreenGeeks web hosting
February 15, 2021 03:45 pm GMT

World-first Static time RegEx engine with O(0) time complexity

What the heck is that?

  • RegEx engine written with static types?!
  • Code which evaluates RegEx "templates" in compile time so you know the result before you run your app?!
  • RegEx engine which works with O(0) run-time complexity?!
  • Minified 0-bite (GZip) length output?!
  • Fully bugged and not ready for production?!

I'm not kidding!!! This is not just a dream!

This is the first world RegEx engine written in pure Typescript types.

Check the working examples!

Alt Text

Alt Text

Alt Text

Alt Text

Github Repo - ts-generics-RegEx-engine

Disclaimer

  • The code is not ready to use in production environment.
  • Because of the stack limits of Typescript, some regExs stop working because they are too long and trigger recursion stack overflow known as Type instantiation is excessively deep and possibly infinite.
  • RegEx backtracking is not implemented yet.
  • The parser supports only a small subset of PCRE standard. Specifically .?*+()\\ symbols.

Motivation + usage

Thanks to new features of Typescript 4.1.x we are able to parse a string into a Tuple of tokens and much more! So I decided to write my own custom RegEx engine just by using Typescript static types to demonstrate how powerful the type system of Typescripts is.

How does the RegEx engine work under the hood?

As you may know, programming languages compilers + interpreters. You may know that they are pretty complex and includes Lexers, Parsers, Interpreters, and so on.

On the other side, this small engine is quite simple, so there are just 3 small modules:

  • 1. Tokenizer
  • 2. Parser
  • 3. Interpreter

1. Tokenizer

A small generic type TokenizeString<T> just parses RegEx template to tokens which are used as the input for 2. Parser to build RegEx Abstract-Syntax-Tree (AST).

Examples:

type T0 = TokenizeString<'\\(+(ab)+'>
Enter fullscreen mode Exit fullscreen mode

Alt Text

type T1 = TokenizeString<'\\(+(a(xy)+(xx)b)+'>
Enter fullscreen mode Exit fullscreen mode

Alt Text

2. Parser

type ParseRegExTokens<T> = ... takes the tokenized template and does the syntax analysis which produces an Abstract-Syntax-Tree (AST) Model of the RegEx template.

Examples:

type T3 = ParsedRegEx<TokenizeString<'\\(+(a(xy)+(xx)b)+'>>
Enter fullscreen mode Exit fullscreen mode

Alt Text

As you can see, the parser supports nesting of structures (like brackets in brackets in brackets etc...)

AST for '\\(+(a(xy)+(xx)b)+' template will look like this:

[{    type: 'element';    value: "(";    quantifier: 'exactlyOne';}, {    type: 'element';    value: "(";    quantifier: "zeroOrMore";}, {    type: 'groupElement';    states: [{        type: 'element';        value: "a";        quantifier: 'exactlyOne';    }, {        type: 'groupElement';        states: [{            type: 'element';            value: "x";            quantifier: 'exactlyOne';        }, {            type: 'element';            value: "y";            quantifier: 'exactlyOne';        }];        quantifier: 'exactlyOne';    }, {        ...; // and so on    }, {        ...; // and so on    }, {        ...; // and so on    }];    quantifier: 'exactlyOne';}]
Enter fullscreen mode Exit fullscreen mode

3. RegEx Interpreter

The last step is to create a proper "interpreter" type Test<RegExp, TestString> = ... which takes a template and a test string by applying rules from the RegEx AST.

Examples:

Alt Text

Alt Text

Alt Text

Alt Text

Alt Text

Alt Text

Alt Text

Alt Text

Alt Text

And that's it!

If you don't believe, you can check the full source code in this GitHub repo: https://github.com/Svehla/ts-generics-RegEx-engine

Wait... And what about the real Javascript output? Let's check it out!

Alt Text

Haha! A few hundreds line of static types and runtime output is empty with O(0) time complexity! That's the magic of Typescript

And what's next?

If you're interested in another advanced usage of the Typescript type system, you can check these step-by-step articles/tutorials on how to create some advanced Typescript generics.


Original Link: https://dev.to/svehla/world-first-static-time-regex-engine-with-o-0-time-complexity-4k4e

Share this article:    Share on Facebook
View Full Article

Dev To

An online community for sharing and discovering great ideas, having debates, and making friends

More About this Source Visit Dev To