July 13, 2021 02:33 am GMT

Writing an Assembler in Rust, and How I'm Redoing the Lexer

I've continued work on my assembler. I finished the lexer, got it to compile, ran it, tested it, and found that it just did not work. Luckily, I found a crate called logos that helps you build a fast lexer, and I'm using it to redo the lexer.

AshtonSnapp / chasm

The Official Cellia Cross-Assembler for Modern Computers


Building chasm

Clone this repository to your local machine, cd into the chasm directory, and run cargo build. Simple!




Right now, I'm writing some callback functions. Specifically, I'm writing the one responsible for handling character immediates (immediate in the sense that the processor doesn't have to fetch an address and then fetch the value from that address; it just has to fetch the value). It does this with a giant match statement. Here's basically what the code for this function looks like right now:

use logos::Lexer;

// `Token` is the lexer's token enum, defined elsewhere in the crate.
fn char(lex: &mut Lexer<Token>) -> Result<u8, ()> {
    let slice: &str = lex.slice();
    // Grab the one byte just before the closing quote as a &str.
    // A str can't be indexed with a single integer, so take a one-byte range.
    // `get` returns None (turned into Err here) if that byte isn't a whole character.
    // Assumes the token is at least two bytes long.
    let poss_char: &str = slice.get(slice.len() - 2..slice.len() - 1).ok_or(())?;

    // Welcome to hell.
    match poss_char {
        "\x00" => Ok(0), "\x01" => Ok(1), "\x02" => Ok(2), "\x03" => Ok(3),
        "\x04" => Ok(4), "\x05" => Ok(5), "\x06" => Ok(6), "\x07" => Ok(7),
        "\x08" => Ok(8), "\x09" => Ok(9), "\x0A" => Ok(10), "\x0B" => Ok(11),
        "\x0C" => Ok(12), "\x0D" => Ok(13), "\x0E" => Ok(14), "\x0F" => Ok(15),
        "\x10" => Ok(16), "\x11" => Ok(17), "\x12" => Ok(18), "\x13" => Ok(19),
        "\x14" => Ok(20), "\x15" => Ok(21), "\x16" => Ok(22), "\x17" => Ok(23),
        "\x18" => Ok(24), "\x19" => Ok(25), "\x1A" => Ok(26), "\x1B" => Ok(27),
        "\x1C" => Ok(28), "\x1D" => Ok(29), "\x1E" => Ok(30), "\x1F" => Ok(31),
        "\x20" => Ok(32), "\x21" => Ok(33), "\x22" => Ok(34), "\x23" => Ok(35),
        "\x24" => Ok(36), "\x25" => Ok(37), "\x26" => Ok(38), "\x27" => Ok(39),
        "\x28" => Ok(40), "\x29" => Ok(41), "\x2A" => Ok(42), "\x2B" => Ok(43),
        "\x2C" => Ok(44), "\x2D" => Ok(45), "\x2E" => Ok(46), "\x2F" => Ok(47),
        "\x30" => Ok(48), "\x31" => Ok(49), "\x32" => Ok(50), "\x33" => Ok(51),
        "\x34" => Ok(52), "\x35" => Ok(53), "\x36" => Ok(54), "\x37" => Ok(55),
        "\x38" => Ok(56), "\x39" => Ok(57), "\x3A" => Ok(58), "\x3B" => Ok(59),
        "\x3C" => Ok(60), "\x3D" => Ok(61), "\x3E" => Ok(62), "\x3F" => Ok(63),
        "\x40" => Ok(64), "\x41" => Ok(65), "\x42" => Ok(66), "\x43" => Ok(67),
        "\x44" => Ok(68), "\x45" => Ok(69), "\x46" => Ok(70), "\x47" => Ok(71),
        "\x48" => Ok(72), "\x49" => Ok(73), "\x4A" => Ok(74), "\x4B" => Ok(75),
        "\x4C" => Ok(76), "\x4D" => Ok(77), "\x4E" => Ok(78), "\x4F" => Ok(79),
        "\x50" => Ok(80), "\x51" => Ok(81), "\x52" => Ok(82), "\x53" => Ok(83),
        "\x54" => Ok(84), "\x55" => Ok(85), "\x56" => Ok(86), "\x57" => Ok(87),
        "\x58" => Ok(88), "\x59" => Ok(89), "\x5A" => Ok(90), "\x5B" => Ok(91),
        "\x5C" => Ok(92), "\x5D" => Ok(93), "\x5E" => Ok(94), "\x5F" => Ok(95),
        "\x60" => Ok(96), "\x61" => Ok(97), "\x62" => Ok(98), "\x63" => Ok(99),
        "\x64" => Ok(100), "\x65" => Ok(101), "\x66" => Ok(102), "\x67" => Ok(103),
        "\x68" => Ok(104), "\x69" => Ok(105), "\x6A" => Ok(106), "\x6B" => Ok(107),
        "\x6C" => Ok(108), "\x6D" => Ok(109), "\x6E" => Ok(110), "\x6F" => Ok(111),
        "\x70" => Ok(112), "\x71" => Ok(113), "\x72" => Ok(114), "\x73" => Ok(115),
        "\x74" => Ok(116), "\x75" => Ok(117), "\x76" => Ok(118), "\x77" => Ok(119),
        "\x78" => Ok(120), "\x79" => Ok(121), "\x7A" => Ok(122), "\x7B" => Ok(123),
        "\x7C" => Ok(124), "\x7D" => Ok(125), "\x7E" => Ok(126), "\x7F" => Ok(127),
        _ => Err(()),
    }
}

Yes, I had to write all of that, because I can't really guarantee that whoever's using the assembler has Unicode support in their program. That whole function was painful to write. At least now, the only callback functions I have left to write are the ones for character escape sequences, strings, addresses, immediates, identifiers (labels and symbols), and actual instruction mnemonics. (Also, I need to stop trying to Ctrl+S while using a browser.)
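For comparison, here is a minimal sketch of a more compact way to get the same mapping, assuming the only goal is to accept a single 7-bit ASCII byte sitting just before the closing quote and reject everything else. The name ascii_char_value is hypothetical, and the function takes a plain &str instead of a logos Lexer so that it runs on its own; it is not what chasm actually does.

```rust
/// Hypothetical helper (not part of chasm): returns the ASCII code of the
/// byte just before the closing quote of a token such as `'A'`.
fn ascii_char_value(slice: &str) -> Result<u8, ()> {
    let bytes = slice.as_bytes();
    // Position of the byte before the closing quote.
    // Tokens shorter than two bytes are rejected instead of panicking.
    let idx = bytes.len().checked_sub(2).ok_or(())?;
    let byte = bytes[idx];
    // Bytes 0x00..=0x7F map to themselves; anything else is part of a
    // multi-byte UTF-8 sequence and is rejected, like the `_` arm above.
    if byte.is_ascii() {
        Ok(byte)
    } else {
        Err(())
    }
}
```

Inside a logos callback, this could presumably be called as ascii_char_value(lex.slice()). Working on bytes rather than on a one-byte string slice also sidesteps the character-boundary problem, since a byte index is always valid.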


Original Link: https://dev.to/ashtonsnapp/writing-an-assembler-in-rust-and-how-i-m-redoing-the-lexer-51eb
