ArcanOS PRE-MAGUS: frame allocator rudiments

It’s been entirely too long coming, but as of ArcanOS commit e1a75edd4b0a, the rudiments of the frame allocator are available.  Those who’ve followed the project will note that the memory manager was disabled and still hasn’t come back, but this will be coming along very soon.  Really, “memory manager” was a misnomer anyway, as the real goal was to have a basic malloc( )function.  Being able to summon up a pointer to an arbitrary chunk of memory could be considered “handy” for early development.  So, the original allocator is what I’ve been calling a “scratch pad allocator.”  Basically, it just assumes there’s a spare megabyte of RAM located just beyond the kernel’s load location and it uses this area for serving memory requests.  This actually was a pretty neat thing to have on hand in the early days of ArcanOS, before we were using paging and when it was more important to me to have a robust kernel debug log than it was to do anything else.  Of course, this had to be retired when ArcanOS switched over to paging, because the 1MB “scratch pad” shouldn’t be assumed to be mapped in.  No…the malloc( )function should necessarily rely on the paging system.

There are some distinct phases to go through in making a mechanism for allocating out the available RAM:

  1. Identify the physical addresses of all the RAM in system.
  2. Create a tracking system to know what RAM is free and which is used.
  3. Create a mechanism for mapping unused RAM into the address space of the process requesting it.

Step (1) was nicely done for us already by GRUB and the map is passed in as part of the multiboot info struct when GRUB dispatches to the kernel’s entry point.  To know where all the free RAM is in the system, you need only parse the struct.  Of course, to make use of the information in the struct, it’s handy to also tackle step (2), so that you have somewhere to record this information about available memory.  Since I think it’s probably the most minimal data structure possible (and, therefore, the easiest to read, learn, and explain…and thus in line with a goal of ArcanOS), I have opted for the ever-humble bitmap.  ArcanOS has an intentional design decision to target a very generic 32-bit x86 architecture and thus limits its address space to 4GB.  This is broken down into page-size chunks and each one gets a bit in the bitmap.  A 1 in the bitmap denotes free RAM, otherwise it should not be touched.

I set up the bitmap in memmgr.c in multiple passes.  In the first pass, I set the bitmap so that all the page frames are invalidated:

void frame_allocator_init(multiboot_info_t* mbi, uint32_t kernel_base, uint32_t kernel_end) {
	int i;
	int frames_allocated = 0;
	for(i=0; i<(PHYS_MEM_MAP_SIZE); i++) {
		phys_mem_map[i] = 0;

Next, I iterate through the memory map given by GRUB and mark all the full and available 4KB frames as being available:

   multiboot_memory_map_t* mmap;
   for (mmap = (multiboot_memory_map_t *) (mbi->mmap_addr);
		(unsigned long) mmap < (mbi->mmap_addr) + mbi->mmap_length;
		mmap = (multiboot_memory_map_t *) ((unsigned long) mmap
								 + mmap->size + sizeof (mmap->size)))
	 //_kern_print(" size = 0x%x, base_addr = 0x%x%x,"
		//	 " length = 0x%x%x, type = 0x%x\n",
			// (unsigned) mmap->size,
		//	 mmap->addr_high,
	//		 mmap->addr_low,
		//	 mmap->len_high,
	//		 mmap->len_low,
		//	 (unsigned) mmap->type);
	if (mmap->type == MULTIBOOT_MEMORY_AVAILABLE) {
		 //Kindly note here that we will *NOT* be making use of the "high"
		 //fields here.  The memory manager is current capped at 4GB, so
		 //we will deal only with the lower 32-bits of addresses.
		 //The goal here is to flip on the bits for available RAM.  After
		 //that, we will flip OFF the bits for the areas where the kernel
		 //was loaded and where the initial page tables sit.
		 //The frame allocator is very happy to fail to mark partial
		 //frames as "not available."  This is a good, safe starting point
		 //as it means that the only physical memory allocated out will
		 //"really be there."
		 uint32_t base = mmap->addr_low;
		 uint32_t len = mmap->len_low;
	     //First, advance up to the next page frame boundary.
		 if ((base % PAGE_SIZE) != 0) {
			 base = base + (PAGE_SIZE - (base % PAGE_SIZE));
			 len = len - (PAGE_SIZE - (base % PAGE_SIZE));
	     //As long as there is another frame to allocate, allocate it.
		 while (!(len<PAGE_SIZE)) {
			 mark_frame(base, FRAME_STATUS_FREE);
			 base += PAGE_SIZE;
			 len -= PAGE_SIZE;

And then finally I make sure that the space taken up by the kernel and the initial page directory and page tables is marked as unavailable.  This is important to remember to do…the space you’ve used before turning on your memory manager is, by definition, not available to use.  Otherwise, you’ll accidentally write over your page tables and other stuff you’ve so laboriously set up:

   //Now, time to make sure the page directory and any existing page
   //tables are never again re-allocated.
   uint32_t* page_entry = (uint32_t*)INITIAL_PDE;
   for (i=0; i<1024; i++) {
		uint32_t pde_entry = page_entry[i];
		pde_entry &= 0xFFFFF000; //Page table address is in the upper 20 bits
		if (pde_entry != 0) {
			_kern_print("Located existing page at 0x%x and will mark it as used\n", pde_entry);
			mark_frame(pde_entry, FRAME_STATUS_USED);
   //Finally, mark off the area where the kernel is loaded
   kernel_base &= 0xFFFFF000; //Find the nearest page (rounding down)
   while(kernel_base < kernel_end) {
	   mark_frame(kernel_base, FRAME_STATUS_USED);
	   kernel_base += PAGE_SIZE;

And, presto.  It’s a bitmap of memory.  The process of feeding free frames of memory back into the allocator is now a process of finding a 1 in the bitmap, calculating its physical address, and then updating the page directory and tables appropriately.  When I write that code, I’ll break it down on here.  I want to add that there are some obvious drawbacks to the use of a bitmap in this fashion.  The first is that the bitmap is going to be pretty sparse, meaning that space is dedicating to bits that will never actually be used.  The bitmap is also very minimal in what it can do– it’s a map of available RAM, but if a frame is marked as 0, there’s no way to tell why.  I also suspect that the process of returning frames to available RAM is not going to be as simple as it could be, and I further suspect that requesting large data extents (such as 4KB for making a new page table) may not go in a perfectly optimal fashion.  All of these, however, are hits I’m willing to take in exchange for the fact that it’s a simple, easy to code, and easy to explain data structure.  This shows how to take the information in a GRUB multiboot info struct and use it to build a map of the available page frames in memory, and being able to show this in a single blog post isn’t so bad.

So, the next step is to get the allocator allocating, which is to say the next step is to take frames out of the bitmap and map their physical addresses to available addresses in the kernel’s address space.

The Bootloader Two Step (aka The x86 Blues)

I knew it would happen early in the project, but I didn’t know it would happen this early in the project.  Incidentally, as I work on the lowest-level parts of ArcanOS, I become increasingly amazed that x86 became the dominant operating system that it did.

The short version for those of you who don’t want to know the gory details– basically your PC is still capable of booting MS-DOS 3.3 off a floppy if you needed it to.  Think about a PC circa 1990 or so.  Think about one today.  Note the difference.  Yet your PC still carries around compliance with standards for that PC back in 1990.  Probably earlier.  Turning it into a modern computer requires enabling and disabling various hardware features in a process that resembles tap dancing on an icy pond in hell.  I’ve just stepped on another crack in the ice, and I’m nerd-raging about it on my blog.  There.  Enjoy your coffee.

So, what could I possibly be talking about?  Well, I’ve been putting work in on the real memory manager for ArcanOS.  Everything in ArcanOS FOOL is basically some Mickey Mouse stuff to just get it booting.  Now I’m sitting down to do it for real, part by part, starting with memory management.  The first step, as you might imagine, is to find out how much physical RAM you actually have so that you know where you can start placing some large memory management data structures like the page table and page directory.  So, how do you find out about the physical RAM?  It’s not like it’s stuffed in a single block of memory addresses.  No.  It’s in bits and pieces all over the place, and to get a reliable map of where it all is, you ask the BIOS to provide it.  So, that shouldn’t be a problem, right?

Well, it is.  Why?  Because the BIOS is available only when you’re in 16-bit “real” mode, and ArcanOS, like most operating systems, uses 32-bit “protected” mode, and one of the first things the bootloader does is get into 32-bit mode so that the OS can get on with its life.  So, after locating some sample code for querying for the memory map, I put it in the bootloader before the switch to protected mode.  “No problem,” I think.  “I’ll stuff the memory map in a guaranteed region of memory and it’ll be waiting for me when the kernel loads.”  I find some sample code, convert it to AT&T syntax, and jam it into the bootloader.  Easy peasy, right?

Nope!  And, why would that be, Bobby?  Again, your PC, at boot time, is stuck in the 1980s.  Booting is a fairly straightforward process for your PC.  It basically looks at your hard drive and loads the first 512-byte sector off of it into a spot in memory and lets it run.  It’s very important that I note that this means the bootloader must be smaller than 512 bytes (and some of those bytes have to be special signatures to prove it’s actually bootable).  ArcanOS FOOL can enable high memory, switch to protected mode, and load and execute the ArcanOS kernel in just under 512 bytes.  Adding the code to get the memory map threw it over its quota by about 20 bytes.

So, where does this leave me?  Really, it leaves me the same place as most people who’ve walked down this road.  I have to make a two-stage boot loader.  Basically, this means that ArcanOS grows from two independent programs (bootloader and kernel) to three.  The first is a bootloader that does nothing but load the second bootloader.  That bootloader can be as big as it wants because it won’t be subject to the single sector limitation.  The first stage loads the second one, and the second one does all the real work.  This does change the ArcanOS build process as well as its disk layout.  In reality, I should now spend those 512 bytes doing something useful like reading the hard drive partition table and locating the second stage bootloader that way.  Ultimately, this will form the rudiments of the ArcanOS file system, since the second stage bootloader is often stashed at a region set aside by the file system.  That means I have to start making tools to provide the filesystem structure and disk layout, whereas the hard disk image for ArcanOS right now is something closer to a floppy.

All because I wanted to know how much memory the computer has and where the memory is kept.  Again, this is the dominant hardware architecture of PCs, laptops, and netbooks.

There is an alternative which I’m considering, and that’s to learn to use GRUB.  GRUB is designed to provide flexible and rich boot features for any OS project.  GRUB will cheerfully assemble a memory map for me, it’ll make pretty boot menus, and I think it might also correctly foam a latte.  And, in fact, GRUB compatibility is absolutely on my mental list for ArcanOS PRIESTESS but I’m doing my best to keep it out of ArcanOS MAGUS.  Why?  Because the primary goal of ArcanOS is to teach.  I need to go through the challenge of a two-stage boot loader and some physical memory detection first, and then, once I know what’s really going on down there, I’ll bring ArcanOS up to GRUB multiboot compliance and make use of its fun extra features.

This also, incidentally, demonstrates why there is a gap between the “Hello World!” bootloader tutorial and the “How to use GRUB” tutorial.  It’s because doing anything meaningful with a bootloader requires a two-stage bootloader and is complex and headache-inducing…definitely not the sort of thing that’s quite so easily explained in a quick blog article.  Hopefully, this audit trail of my struggles will help improve that, even though I will be doing like a good open source coder and, after ArcanOS MAGUS, leave behind some of the irritations of x86 hardware and boot and let GRUB do the heavy lifting.

ArcanOS FOOL released

It may have taken a little longer than I would have initially liked, but I’m pleased to announce that, as of this evening, ArcanOS has had its first milestone release — FOOL.  As I have mentioned in the ArcanOS wiki on Google Code, I think version numbers are pretty arbitrary, so I have instead opted to make milestone releases and name them after the trump cards in a tarot deck.  The goals for FOOL are pretty apt for its name.  In a tarot deck, “the fool” is often depicted as a young man blissfully stepping off a cliff.  The goals for this release were to basically lay the groundwork for getting to the “real work.”  This meant having a coherent build process and test/debugging platform and developing a bootloader, rudimentary kernel, basic debugging tools (like logging), exception handlers, and at least one interrupt handler.  Having all these basic pieces in place means that I have something stable I can use for future development.

Currently, ArcanOS is little more than some initialization code and a spinlock.  There is a basic handler for the keyboard interrupt, but that’s it.  There’s very little to try out, as there’s no user-space and thus no programs written for it.  Still, I’m pretty pleased with it.  It’s the first personal project I’ve taken on since 2001, and most personal open-source projects don’t even make it to their first milestone release.

The next release will be ArcanOS MAGUS.  I’m currently scoping out my goals for it.  I know I want to move the memory model to a flat, paged model.  This will affect multiple parts of the code base because it’ll mean that I will be able to stop adjusting for the segmentation offset whenever I need a linear or physical address.  I suspect this alone will make paging the major effort, but I want to do some other things that will keep moving things forward.  After all, having an environment for user processes is the point at which things get really fun.

As I mention in the ArcanOS notes, I owe a debt of gratitude to Dr. Lorenzo Alvisi as well as the xv6 project over at MIT.  xv6, in an earlier iteration, was known as JOS, which Dr. Alvisi used to produce a series of operating systems lab exercises for his operating system courses.  ArcanOS started out as one of those lab exercises, and though it has changed very significantly already, it still bears much of the original bootloader code.

ArcanOS FOOL Feature Complete

I am pleased to announce that the current push of ArcanOS represents a completion of features for release FOOL.  The feature list won’t look very big, but remember that this is a labor of love being done in my (not very ample) spare time, and that most homebrew OS projects never even make it this far.  The list of features slated for the FOOL release are:

  • Bootloader which reads the kernel from disk
  • Protected mode operation
  • Flat memory model achieved through segmentation only
  • Identification of the major kernel’s binary sections in memory
  • Scratch space memory allocator
  • Debugging output via a console print
  • Hardware exception handlers
  • Keyboard interrupt handler

I’m opting to call feature complete at this point because ArcanOS FOOL represents a small laundry list of features necessary to build the next set of kernel features.  That is…FOOL is a kernel that doesn’t just fall completely on its face.  Issues of booting have been worked out, the memory model is understood (if primitive), the exception handlers are in place so errors don’t cause a spontaneous reset, and the keyboard interrupt handler demonstrates that the PIC is remapped and that code for creating and registering interrupt service routines works.  I don’t want to go any further on feature development at this point because I want to enable paging and use it to enforce the flat memory model, and switching to paging will mean having to deal with memory-mapped hardware (like the console) differently.  A smaller codebase is easier to convert than a larger one.

ArcanOS is going to go through a cleanup and documentation phase now.  I want to thoroughly scrub all unnecessary code from the codebase and to fully get ArcanOS off its JOS heritage so that it’ll be a consistent project moving forward.  I’ll then carve off a release and post a build so people can play with it using Bochs, and then I’ll write Wiki pages to help document my major headaches.  Then, it’ll be time to start looking forward to release MAGUS.

ArcanOS pre-FOOL: Boot Summary and Heap Allocator

Last night, I did a push to the code repository for ArcanOS which represents the last of the work needed for its first heap allocator.  I inherited the majority of my bootloader code (please see the project website for my acknowledgements), so I decided that my first milestone for ArcanOS wouldn’t be “it boots and can run C code.”  That’s an amazing luxury, I might add.  The x86 boot process is rather arcane, as is learning how to read from a hard drive.  In short, the bootloader has to do a few things.  First, it has to set up a stack.  Then, it has to enable the A20 line so you can access some real memory.  Then, it has to set up some sort of segmentation as a prelude to switching to protected mode (so that 32-bit code can be used, among other things).  After all of that, being in 32-bit mode and armed with a stack, it can run the rest of itself in fairly standard gcc C, and you can thus more easily do the final task– loading and dispatching a bigger binary to do more meaningful work.

ArcanOS currently has no file system, so the kernel is just stuffed on the sector after the boot sector.  The kernel is an ELF binary, and the headers for ELF give the necessary information about where to load the kernel and how to jump to it.  The entry point to the kernel is actually in assembly because the first thing the kernel does is re-do memory segmentation so that the kernel loads into the higher memory addresses.  The boot loader actually loads the kernel at roughly 1MB, but the kernel is linked to be loaded at address 0xf0000000 + 1MB, so the first thing that’s done is that a new descriptor table is loaded to map all the low memory to high addresses.  This actually moves the video memory buffers up into the higher addresses as well, which is why the ArcanOS “console driver” must offset its writes by KERNBASE.

I’m going to skip over my experiences writing the “console driver” for now.  Suffice it to say, when you are implementing your own kernel, you quickly realize how little you have to work with.  I was once told that, ultimately, there are only two debugging tools — cleverness and printf( ).  I actually repeat that to rookies on my team, because, generally speaking, it’s true.  Well, guess what?  On day 1 of your kernel, you don’t have printf( ), and you have to go make one for yourself.  That starts with a functional screen console of some kind, and some bits of library code, such as itoa( ), that you’ve likely taken for granted your whole life.  So, a fairly considerable amount of time went into just getting some basic screen printing for debugging purposes.  I have no doubt that the “console driver” is, ultimately, throw-away code in favor of a far more robust console system, but necessity dictates.

The other thing I wanted to go ahead and get out of the way was a kernel heap allocator.  If there’s another function we all take for granted, it’s malloc( ).  There is no malloc( ) in your kernel if you don’t write it yourself.  This itself was an interesting journey, because the first challenge is finding a place to start the heap.  Placing the heap wrong might mean overwriting the kernel or running into memory dedicated to hardware I/O.  Fortunately, there isn’t much hardware to dodge once you’re above the 1MB offset, so the question is how to find where the kernel ends.  The way I’ve solved this is by extending the linker script called “kernel.ld”.  The PROVIDE( ) function in linker script language is particularly useful, and it lets you store information gathered in the linking process into C variables which you declare “extern”.  You can see the code which “receives” this information in kern/init.c.

What about the allocator itself?  Well, I decided that determining the amount of RAM on the target platform was a fish too big to fry right away.  RAM detection requires that I either give up my boot loader in favor of GRUB or that I add RAM detection into my boot loader before I go into protected mode (since it requires BIOS calls).  Both are non-trivial, and I also don’t plan on doing any massive allocations any time soon.  Given this, the heap just naively assumes it’s got 1MB of space in which to work.  I’ll correct this in future releases.

The allocation algorithm is incredibly simple.  The allocator will check to see if it’s got a free chunk that’s already the right size and will return that if it’s available.  Otherwise, the last member of the free list is always the “frontier” of never-before-allocated space, and it will carve off a new hunk from it.  The algorithm for freeing space it so simply move the chunk from the “allocated list” into the “free list” for future use.  There’s no provision for coalescing chunks together in the free list or anything like that, and I have some reasons for not doing this.  The first reason is because, frankly, I expect that I will rewrite this heap allocator when I have RAM detection and other goodies of this nature available to me.  The other major reason is because this allocation scheme is, in reality, enough to keep development moving along for now.  ArcanOS currently just services itself, and its allocation needs are small, and the 1MB space is likely to be plenty for a while.  Moreover, I expect most of my allocation needs will be a handful of objects of regular size.  If I were allocating lots of things of different sizes, this would eat into memory pretty badly, but if I’m just allocating structs of the same type over and over again, this system works pretty well.  Finally, a more complex algorithm means more chances for bugs…at a time when my available debugging options aren’t huge.  It’s unwise to invite extra bugs for functionality you don’t need when you expect you’re going to rewrite it anyway.

A crummy malloc( ) is, by the way, WAY better than no malloc( ) at all.

And now, armed with console out and a heap allocator, it’s time for me to attack the last feature for the FOOL milestone — writing ISRs and turning on interrupts.