Starting point for porting to a 32-bit architecture? #2256

Open
zap8600 opened this issue Mar 7, 2025 · 38 comments

Comments

@zap8600

zap8600 commented Mar 7, 2025

Hello! I was wondering if anyone had any tips/pointers on where to start porting ELKS to a new architecture. I love this port of Linux and its very minimal resource usage, and I would love to use it on a regular basis for development. However, since it currently only targets IA16, I would have to use an emulator to do so. Any information would be greatly appreciated. Thank you in advance!

@ghaerr
Owner

ghaerr commented Mar 7, 2025

Sounds like you're thinking about a non-IA16 CPU? Which one do you have in mind? Porting to a non-segmented or non-16-bit CPU architecture probably won't work out; there's just too much code written around both.

@zap8600
Author

zap8600 commented Mar 7, 2025

I was thinking of an x86 (32-bit) CPU to start with, since it's just further down the same processor line, and it puts me one step closer to porting ELKS to other architectures. I can rewrite as much as I need in my own repository, and I can make it architecture independent so I can still have the original IA16 port too. Of course, each architecture will still need platform-specific code to keep the resource usage as low as it already is, but (in my personal opinion) it would be worth it to have Linux anywhere I need it, especially on embedded devices.

@ghaerr
Owner

ghaerr commented Mar 7, 2025

For a small but very powerful 32-bit Linux-like OS, I recommend Fiwix, it's fantastic and already runs a large variety of programs.

For ELKS to support a 32-bit port, it'd be a major rewrite of all the core process handling, multitasking, all assembly language, and using a new C compiler for the kernel; a huge job. If you really want to try, I recommend studying the entire kernel source code base before starting.

@ghaerr ghaerr changed the title Starting point for porting to a new architecture? Starting point for porting to a 32-bit architecture? Mar 7, 2025
@zap8600
Author

zap8600 commented Mar 7, 2025

Fiwix looks pretty cool, but it's not specifically what I'm looking for. I'm looking to use ELKS specifically for the 512K of RAM needed at minimum to be useful. I don't exactly need ELKS on my desktop PC, but on the embedded devices I bring with me regularly, having a Linux(-like) environment would be great. I am willing to go the length to rewrite as much kernel code as needed for this.

@toncho11
Contributor

toncho11 commented Mar 7, 2025

32-bit will mean more memory is used, which conflicts with the 512 KB you want to target. On a 32-bit CPU you want to use the native 32-bit word, which is the fastest unit to access, even when you do not need a 32-bit variable. I just checked: MS Windows does optimize memory to avoid wasting it on small objects.

Ask chatgpt: "How Windows optimizes memory on 64 bit architectures when variables does not need to be 64 bits?"

Actually it depends on the software. Programs that use a lot of 32-bit ints will run faster, and the ones with many 16-bit ints just a bit slower. But memory consumption will increase for sure if you do not apply some tricks.
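
A rough way to see the effect described here, as a sketch only: the two structs below are hypothetical (they are not taken from the ELKS sources) and hold the same four fields, once as 16-bit and once as 32-bit integers. Widening every field doubles the size of each record, and therefore of any table built from them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record using 16-bit fields, as a 16-bit kernel might. */
struct rec16 {
    int16_t pid;
    int16_t uid;
    int16_t state;
    int16_t flags;
};

/* The same record with every field widened to a 32-bit word. */
struct rec32 {
    int32_t pid;
    int32_t uid;
    int32_t state;
    int32_t flags;
};

/* Bytes needed for a table of n records of the given element size. */
static size_t table_bytes(size_t n, size_t elem_size)
{
    return n * elem_size;
}
```

On common compilers sizeof(struct rec16) is 8 and sizeof(struct rec32) is 16, so a 100-entry table grows from 800 to 1600 bytes unless the narrower types are kept deliberately.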

@zap8600
Author

zap8600 commented Mar 7, 2025

I know that 32-bit will take more memory, but it has its optimizations just like 16-bit does. I have to rewrite most of the code anyway, so I will have to drop the 16-bit optimizations and add 32-bit ones. I don't need to make far memory reads & writes, so that part of the code can be dropped. Some 32-bit processors will have XIP, so they'll use way less memory than 512KB anyway. I'm not specifically aiming for 512KB; I used it as an example of how little ELKS needs to run. I'll start porting to either x86 or an RP2040 for an initial 32-bit port.

@djipi

djipi commented Mar 11, 2025

Hi.
I was thinking about the same question but about M68000 architecture. Wondering if it can be doable or was already done before.
Thanks.

@toncho11
Contributor

Hi. I was thinking about the same question but about M68000 architecture. Wondering if it can be doable or was already done before. Thanks.

I think ELKS makes too many assumptions about the x86 architecture, but I am not an expert either.
So a better objective might be to try to compile all software for ELKS for the Motorola, by first porting ELKS libc to an existing Motorola 68000 Linux (or Linux like).

@zap8600
Author

zap8600 commented Mar 11, 2025

To make it architecture-independent, the assumptions have to be either changed to assume a different architecture or removed and replaced with abstractions that can be filled by an architecture-specific directory.

@ghaerr
Owner

ghaerr commented Mar 11, 2025

I was thinking about the same question but about M68000 architecture. Wondering if it can be doable or was already done before.

Rewriting ELKS to run on 68000, or any other processor, for that matter, is "doable", but no ports other than to essentially 8086-based segmented architectures have been done. The real question is whether the ELKS source base would be a good starting point.

IMO, the answer to that depends on how much spare time you have, how much one knows about the new CPU (and 8086), and whether one has the required tools or not.

To make it architecture-independent, the assumptions have to be either changed

In order to port ELKS to a new CPU, the exact opposite needs to occur: the core kernel code needs to be rewritten in an extremely architecture-dependent way, so that the resulting OS is efficient and actually works. The porting process needs to start at the core required functionality, or nothing else will run. Almost all of the source base that can be made portable already has been, given various restrictions.

The core of ELKS (and of most OSes) is a program (i.e. a piece of code) that uses a hardware interrupt in order to switch application stacks and generally implement the illusion of multi-tasking. In addition, OS services, aka system calls, use the same mechanism. This code is usually written in assembler and is innately tied to the way that the CPU handles stack switching and interrupts in hardware. In ELKS, this code is the _irqit routine, and is complicated: I have written a two-part article on system calls and interrupts on our Wiki to try to describe its subtle complexities. For those wanting to port ELKS to a new CPU, I recommend you start there and fully understand how that mechanism works on 8086.

So a better objective might be to try to compile all software for ELKS for the Motorola, by first porting ELKS libc

The ELKS C library is a very portable set of code; we now have it ported to three different compiler toolchains. Starting here won't do much in porting any kernel - as the kernel uses no C library code. The ELKS C library will likely easily port to any new toolchain, and many of the ELKS user mode programs will also port easily, unless they are directly tied to the system internals.

After understanding how the core ELKS kernel time-slicing code works, one is in a much better position to get that running on a new CPU, then start adding other parts of the system to the codebase. The beginning of real coding work will require the selection of a full C toolchain - compiler, assembler, and linker, along with a target board/system to port to and a way to transfer programs back and forth (or use an emulator).

@toncho11
Contributor

toncho11 commented Mar 11, 2025

A few starting points:

The compiler could be: https://github.com/M680x0/M680x0-llvm

You can try to reuse:

@djipi

djipi commented Mar 11, 2025

Thank you. I do not have much spare time to spend on this, but I will have a look at the links for sure.
I know the 68000 and have past experiences with the 8086 too. If I start a "port" of elks, it will be for the 68000/68010.

@rafael2k
Contributor

rafael2k commented Mar 14, 2025

ps: ELKS does run on Intel 32 bits (of course, using the 16-bit real mode), you don't need an emulator, and with some effort, you could use XMS for using more memory, and even unreal mode, for example, to map more memory and use 32-bit instructions, while still running the 16-bit ELKS kernel. My ELKS development machine is a Thinkpad T430, which is ages more recent than the last 8088 computer.

@ghaerr
Owner

ghaerr commented Mar 14, 2025

ELKS does run on Intel 32 bits

with some effort, you could use XMS for using more memory, and even unreal mode, for example, to map more memory

ELKS already does just that right now. Enable CONFIG_FS_XMS_BUFFER=y and set CONFIG_FS_NR_XMS_BUFFERS=2500 and the system will use 2.5MB of XMS memory for system buffers - resulting in super high speeds as the system runs almost entirely from memory. (Be aware though, that disk syncing is not automatic and must be enabled using sync=N seconds in /bootopts or you could lose everything in a system crash).

and use 32-bit instructions, while still running the 16-bit ELKS kernel.

However, this feature is entirely optional. In most cases, what people really want is 32-bit applications. A big potential problem with allowing 32-bit instructions in application programs is that the program then (of course) doesn't run on older systems, and complaints start. In general, if users need the advantage of 32-bit programs, they ought to be running a whole 32-bit system, including kernel (thus my recommendation for Fiwix above).

Of course, we could allow 32-bit user applications, but that would entail getting another compiler ported, porting the C library again, only to IMO find out that most 32-bit applications would immediately run out of memory, since memory allocations would still be extremely limited, given the lack of RAM (XMS memory above can't always be directly accessed with unreal mode due to complications with certain later model PC BIOSes [e.g. Compaq 386] that run in protected mode themselves, causing conflicts with unreal mode).

In conclusion, the motivations of the ELKS Project remain trying to get big things to happen with little resources without having to have complicated multiple versions of the software for different CPUs. Otherwise, all modern systems can just boot a 32- or 64-bit OS and all the "problems" go away.

@zap8600
Author

zap8600 commented Mar 14, 2025

Such is my goal too. I'm not looking for 32-bit applications, I'm looking for big things with little resources, however most embedded devices I own are ARM (32-bit) and not IA16. I don't intend for this to ever be merged with mainstream ELKS since it would conflict with everything and defeat the whole purpose. I just need the capabilities of ELKS on devices with little to spare without either writing everything from scratch or emulation, so I'm willing to jump through the hoops necessary to achieve this.

@toncho11
Contributor

toncho11 commented Mar 17, 2025

I was thinking that ELKS is reaching the limit of what can be done for some applications such as Nano-X. Having several applications running simultaneously is already exhausting the entire memory. There is no chance for a graphical browser under Nano-X for example.

So is there some trick, some elegant way to access more memory on 286 and 386?
XMS in ELKS is only usable by the kernel, not the applications, right?

One way could be to implement a working swap file. On 16bit machines the swap file will be slow or even disabled, but on machines with more RAM the swap file can be placed in RAM disk beyond the 1MB. In this case the applications stay 16 bit, but the additional RAM is still used. I am sure it is not simple, because it can be a swap file per process or by smaller chunks of memory and it will be complicated.

@rafael2k
Contributor

I'm playing a bit with Microweb and Arachne (both 16-bit software), I can see they explicitly deal with swap, XMS / EMS. I agree something like a ramdisk will suffice for software developers, while a kernel API for XMS access/copy would also be useful. In order not to introduce a lot of complexity, I'd leave swapping logic to the userland.

@ghaerr
Owner

ghaerr commented Mar 17, 2025

I see two big problems with implementing swap on ELKS, for the reasons above to increase memory:

  1. Swapping is extremely inefficient on 8086 with no MMU since the swapped program(s) would have to be swapped back into their exact same memory locations, thus likely requiring the entire set of applications to be swapped in/out at once, thus killing any multitasking. This problem has been discussed previously.

  2. Swapping won't increase available memory for an existing memory-needing application by very much at all, since likely only /bin/init and /bin/sh would be running in a tight environment. Neither of these processes is necessary and can be disabled now without any programming, if the application is huge, without swapping.

I'd leave swapping logic to the userland.

Agreed. There are overlay linkers available for this sort of thing, and IIRC OpenWatcom library has some support for it, although likely lots of work to port that over. Writing programs to use this sort of thing is another issue entirely.

XMS in ELKS is only usable by the kernel, not the applications, right?

For the unreal mode implementation, yes. For protected-mode BIOSes, INT 15h is used by the kernel and could also likely be used by applications, possibly requiring just a small wrapper around the call.

while a kernel API for XMS access/copy would also be useful.

Yes, such a thing wouldn't be too much effort to make work, since it's already being done by the kernel. The API would only allow block-copying directly between main memory and an XMS region, no direct access. As mentioned previously, the direct access can't be made to work using unreal mode since some PC BIOSes (think Compaq) already use protected mode within their BIOS, and in those cases the BIOS INT 15h protected mode XMS block copy function is used instead of unreal mode.

(Thus, programs running on a modern BIOS could probably just use the BIOS INT 15h block memory copy function now directly, without having explicit kernel support, should anyone really have programs ready to use this sort of thing).

However, the biggest issue with memory-copy solutions is that the user land program(s) in almost all cases have to be explicitly designed to work in such a way. No modern 32-/64-bit software is written in this way, as they're all using mmap() and depend on OS demand paging to do their dirty work. This is equivalent to the problem of text editors specially written to handle files larger than memory, vs those that can only handle text files that fit in main memory. There are very few of the former text editors I'm aware of.

@rafael2k
Contributor

rafael2k commented Mar 18, 2025

Arachne browser, for example, allocates memory using contexts, which relate to data dependencies of the runtime (e.g. HTML tables, images, fonts, core memory, cookies, etc.), so when swap is needed, the software knows which context is not being used and can "copy it out" to swap (be it on XMS or disk). The code is mostly ifdef'ed to be used only on DOS, but could be adapted to ELKS. The same can probably be said about MicroWeb; I have not yet looked at which strategy it uses, but it does swap and/or use XMS.

@rafael2k
Contributor

I think ELKS is very important not only for 8088 and similar CPUs, but also for the 80286, as there is no modern Unix for it. I'm not sure about the level of hardware support for unreal16 mode versus INT 15h for accessing XMS, but the more APIs / facilities ELKS provides for the developer, the better, so the software can check the machine it is running on and choose the appropriate facility for extended memory access. I usually give the browser as an example, but our C86 compiler is also a good candidate for anyone wanting to play with swapping, to allow it to compile larger C sources.

@toncho11
Contributor

toncho11 commented Mar 18, 2025

So maybe a C library for swapping - an API that can be used by all applications. Maybe even this API can later be added to libc. The advantage will be that swap could have different backends. If XMS is detected then it will use XMS and not the disk (this is just an example to illustrate the idea).

So on an old system it will be slow using the disk to swap everything, but on newer systems it will use the extra memory. And it will be only for applications that know how to use it.

@rafael2k
Contributor

I like your idea @toncho11. Some kind of API like "copy_to_extended()", "copy_from_extended()", plus some init functions to declare the size of the memory block that can be swapped, and maybe an identifier. This could live in userland, so if there are improvements / new features in the kernel, they could be easily introduced without breaking userland.

@toncho11
Contributor

toncho11 commented Mar 18, 2025

Can the first version be exactly that: store to XMS if available, otherwise to disk in /tmp (in the future block device) ?

Some example API:
int init_extended(void);
int store_to_extended(const void *src, size_t length, const char *key_str);
int get_from_extended(void *dest, const char *key_str, size_t *length);

So store_to_extended copies data from the pointer at a specified length. The kernel will need to handle this copy.
Then get_from_extended copies the data to a specified void* pointer and returns the length specified before.
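
As a minimal sketch of how the disk fallback of this proposal could look, assuming one file per key under /tmp: only the three function names come from the proposal above. The exact parameter types, the return codes (0 / -1), the "ext_" file prefix and the rule that *length carries the buffer capacity on entry are assumptions made for illustration, and no XMS backend is included.

```c
#include <stdio.h>
#include <string.h>

#define EXT_DIR "/tmp"          /* assumed backing directory (disk backend only) */
#define EXT_PATH_MAX 64

/* Build "/tmp/ext_<key>"; returns 0 on success, -1 if the key is unusable. */
static int ext_path(char *out, size_t outlen, const char *key_str)
{
    if (key_str == NULL || *key_str == '\0' || strchr(key_str, '/') != NULL)
        return -1;
    if (strlen(EXT_DIR) + strlen("/ext_") + strlen(key_str) + 1 > outlen)
        return -1;
    sprintf(out, "%s/ext_%s", EXT_DIR, key_str);
    return 0;
}

/* Nothing to set up for the disk backend; an XMS backend would probe here. */
int init_extended(void)
{
    return 0;
}

/* Copy length bytes from src into the store under key_str. */
int store_to_extended(const void *src, size_t length, const char *key_str)
{
    char path[EXT_PATH_MAX];
    FILE *fp;
    size_t n;

    if (ext_path(path, sizeof(path), key_str) < 0)
        return -1;
    if ((fp = fopen(path, "wb")) == NULL)
        return -1;
    n = fwrite(src, 1, length, fp);
    if (fclose(fp) != 0 || n != length)
        return -1;
    return 0;
}

/* Copy the block stored under key_str into dest. On entry *length is the
 * capacity of dest; on success it is set to the number of bytes copied. */
int get_from_extended(void *dest, const char *key_str, size_t *length)
{
    char path[EXT_PATH_MAX];
    FILE *fp;
    size_t n;

    if (ext_path(path, sizeof(path), key_str) < 0)
        return -1;
    if ((fp = fopen(path, "rb")) == NULL)
        return -1;
    n = fread(dest, 1, *length, fp);
    fclose(fp);
    *length = n;
    return 0;
}
```

An application would call store_to_extended() when it wants to evict a buffer and get_from_extended() with the same key to bring it back. If /tmp is later mounted on a RAM disk, the same calls would run against memory without any change to the application.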

@ghaerr
Owner

ghaerr commented Mar 18, 2025

Yes, such a thing wouldn't be too much effort to make work, since it's already being done by the kernel.

Looking at the kernel XMS code in elks/arch/i86/mm/xms.c, it seems I got a little ahead of myself saying this won't be too much effort to get running. Close inspection shows that even the BIOS INT 15h interface requires the setup of an 80386 GDT (global descriptor table) in order to function. So read up on how 386 protected mode memory management works. The 386 unreal mode also uses a GDT to extend the "shadow" segment limit registers in the CPU beyond the normal 64K boundary. In order to describe these addresses, a "32-bit linear" address type ramdesc_t is defined, which is used in the wrapper calls to unreal mode XMS and/or INT 15h XMS memory block moves. These functions can be found in elks/arch/i86/lib/unreal.S. Get familiar with those also. The A20 line gate and other PC hardware issues are also handled in that file.
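
For readers new to these terms, here is a small sketch of the two pieces of arithmetic involved: turning a real-mode segment:offset pair into a linear address, and packing a base and limit into an 8-byte 80386 segment descriptor of the kind a GDT holds. This is not the code in xms.c or unreal.S; linear_t, seg_off_to_linear() and pack_descriptor() are made-up names, and how ELKS actually lays out its table is not shown here.

```c
#include <stdint.h>

typedef uint32_t linear_t;   /* hypothetical stand-in for a 32-bit linear address */

/* Real-mode segment:offset to linear address: segment * 16 + offset. */
static linear_t seg_off_to_linear(uint16_t seg, uint16_t off)
{
    return ((linear_t)seg << 4) + off;
}

/* Fill an 8-byte 80386 segment descriptor for the given base and limit. */
static void pack_descriptor(uint8_t d[8], uint32_t base, uint32_t limit,
                            uint8_t access, uint8_t flags)
{
    d[0] = (uint8_t)(limit & 0xFF);           /* limit bits 0..7 */
    d[1] = (uint8_t)((limit >> 8) & 0xFF);    /* limit bits 8..15 */
    d[2] = (uint8_t)(base & 0xFF);            /* base bits 0..7 */
    d[3] = (uint8_t)((base >> 8) & 0xFF);     /* base bits 8..15 */
    d[4] = (uint8_t)((base >> 16) & 0xFF);    /* base bits 16..23 */
    d[5] = access;                            /* present bit, DPL, segment type */
    d[6] = (uint8_t)(((flags & 0x0F) << 4) |  /* granularity/size flags */
                     ((limit >> 16) & 0x0F)); /* limit bits 16..19 */
    d[7] = (uint8_t)((base >> 24) & 0xFF);    /* base bits 24..31 */
}
```

For example, segment 0xFFFF with offset 0x0010 maps to linear address 0x100000, the first byte above 1M, and a descriptor with that base and an access byte of 0x93 describes a present, writable data segment starting there.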

There are definite complications to getting all this to work, and now that I look at it closer, it's probably better to start doing this all in user land, rather than building a kernel interface first, only to see that it won't work with the architecture of the user programs requiring it, or that the browser base code is too big to run at all.

Here are some of my further thoughts on it:

  • What will the application do when there isn't any more XMS memory? Halt/abort? Limit functionality gracefully? We don't currently even test for available XMS memory, just assume that 1M is available.
  • The XMS memory itself may need to be managed using an arena allocator. None of this is written. Currently, XMS buffers are handled by dividing 1M+ XMS into 1024 contiguous 1K buffers per megabyte. This won't work if the application requires a malloc-like interface.
  • What does the application do when running on a non-386 machine? Does it refuse to run, or try to run with minimal memory?
  • The application will need to test for BIOS INT 15h capabilities, to decide whether to use that or unreal mode (which also needs to be tested for, see elkscmd/sys_utils/unreal16.S).
  • Our XMS code doesn't work on 80286 processors: unreal mode won't work. 286 needs use of the undocumented LOADALL instruction, which hasn't been implemented.
  • I don't recommend trying to start by implementing both XMS copy in/out and disk in/out. Disk copying with XMS is more complicated, as it requires a kernel I/O request and the application getting switched out and stacks switched, etc. Another idea might be to start with disk in/out only, and no XMS - that can be done without any additional system calls and would work on any system (slowly).
  • I recommend starting with an API close to the user land application's internal interface for memory swapping, rather than starting from scratch. This interface should be exposed and discussed, so that differences between it and the existing kernel internal XMS and BIOS INT 15h block copy interfaces can be identified.
  • After the application(s) get XMS running somewhat reliably, a new set of system calls can be developed to move the interface into the kernel for use by other applications.
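
To illustrate the arena idea from the list above, here is a minimal bump-allocator sketch. It only does bookkeeping on 32-bit linear addresses and never touches extended memory itself; xms_addr_t, struct xms_arena and the xms_arena_* functions are hypothetical names, not existing ELKS interfaces. It assumes the arena starts at or above 1M, so 0 can serve as the failure value.

```c
#include <stdint.h>

typedef uint32_t xms_addr_t;     /* hypothetical: a 32-bit linear address */

struct xms_arena {
    xms_addr_t base;             /* first usable linear address */
    uint32_t size;               /* bytes in the arena */
    uint32_t used;               /* bytes handed out so far */
};

/* Start an arena covering size bytes of extended memory at base. */
static void xms_arena_init(struct xms_arena *a, xms_addr_t base, uint32_t size)
{
    a->base = base;
    a->size = size;
    a->used = 0;
}

/* Hand out nbytes rounded up to a 16-byte multiple; returns 0 when full. */
static xms_addr_t xms_arena_alloc(struct xms_arena *a, uint32_t nbytes)
{
    uint32_t need;
    xms_addr_t addr;

    if (nbytes == 0 || nbytes > 0xFFFFFFF0UL)
        return 0;                          /* reject empty or overflowing sizes */
    need = (nbytes + 15) & ~(uint32_t)15;
    if (need > a->size - a->used)
        return 0;                          /* not enough room left */
    addr = a->base + a->used;
    a->used += need;
    return addr;
}

/* Release everything at once; an arena has no per-block free. */
static void xms_arena_reset(struct xms_arena *a)
{
    a->used = 0;
}
```

The returned address would then be passed to whatever block-copy routine moves data in and out of XMS. A malloc-like interface with individual frees would need more bookkeeping than this.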

In summary, I would suggest to keep working on getting a browser to be initially operational, then determine how it should operate with various amounts of extra memory (or not), and publish its internal interface(s) for memory management. After that and fully understanding how the internal ELKS XMS works, it can be decided what kind of (arena/linear/other) memory allocation approach could be mapped on top of an unreal/INT15 XMS implementation.

However, if the application were to use its internal interface to perform only disk <-> main memory copy in/out, this could be made to work without any of the above XMS complexities, and would not be limited to XMS memory nor arena allocation issues. It would run slowly, but would enable testing the application with much less work, it seems.

@toncho11
Contributor

toncho11 commented Mar 18, 2025

I won't pretend I understood all the technical details :)

Is it possible to create a RAM drive block device based on XMS currently? This means that no allocator is needed, we do not care for fragmentation, etc. This can be a very good first step because our "Swap API" will have the possibility to switch between real disk drive (in /tmp) and the XMS memory for the API later. I mean a RAM drive that can be mounted and used by applications in ELKS. This RAM drive is a well defined task.

If doable it can be a bit like that "Here is some memory for you applications, you use it the way you want! (and can)"

@ghaerr
Owner

ghaerr commented Mar 18, 2025

Is it possible to create a RAM drive block device based on XMS currently?

Actually that's a pretty good idea - since XMS access is already present in the kernel, we should be able to extend the existing RAM drive to use linear XMS memory, so that the block device essentially becomes the memory allocator. Nice. This also could fit well into the idea of the simpler first approach of using disk swapping and then just replacing the /tmp swapfile with a RAM drive. I'll look more into it :)

I'm studying the XSWAP implementation a bit, described in:

Reading the assembly, it seems it uses HIMEM.SYS facilities;

Hmm, later versions of HIMEM.SYS were used as the XMS allocator, rather than the older BIOS INT 15h function. But that's all RAM-based copy in/out. Does Arachne support direct to disk memory swapping without XMS?

@ghaerr
Owner

ghaerr commented Mar 18, 2025

Reading the assembly, it seems it uses HIMEM.SYS facilities;
https://github.com/rafael2k/arachne-browser/blob/main/lopif/xmsadd.asm

Yes, that code is DOS only, as it relies on INT 2fh calls into a DOS system to handle XMS memory likely in accordance with the DOS XMS spec. It is possible that the mem_xmem1 routine(s) could be rewritten for ELKS XMS as described above. This would be much easier after an XMS API were developed for the ELKS kernel though, but would still need significant research to ensure the APIs are compatible before starting work.

Some code:
https://github.com/sraase/arachne-browser/blob/main/bufbuf.c

This code looks like it tries to implement both XMS and disk memory swapping. However, the code is so terribly ugly and hacked on that I can barely follow it, and the comments all being written in Czech or Polish doesn't help. Good luck!!!

I would say you've got your work cut out for you if all Arachne code looks like this. If you can get disk swapping running using the existing source code, using @toncho11's idea of then just replacing this with an XMS ramdisk would be by far the easiest going forward. So essentially this means the path forward is to continue porting Arachne to ELKS using its existing source base and its existing disk swapping, with its XMS/HIMEM.SYS support turned off, if possible.

@toncho11
Contributor

So there should be a /dev/xms on which mkfs will put minix and then we will be able to mount it?
Or there is a simpler way?

@rafael2k
Contributor

Reading the assembly, it seems it uses HIMEM.SYS facilities;
https://github.com/rafael2k/arachne-browser/blob/main/lopif/xmsadd.asm

Yes, that code is DOS only, as it relies on INT 2fh calls into a DOS system to handle XMS memory likely in accordance with the DOS XMS spec. It is possible that the mem_xmem1 routine(s) could be rewritten for ELKS XMS as described above. This would be much easier after an XMS API were developed for the ELKS kernel though, but would still need significant research to ensure the APIs are compatible before starting work.

Right. I'll hold off on the XMS code for now.

Some code:
https://github.com/sraase/arachne-browser/blob/main/bufbuf.c

This code looks like it tries to implement both XMS and disk memory swapping. However, the code is so terribly ugly and hacked on that I can barely follow it, and the comments all being written in Czech or Polish doesn't help. Good luck!!!

I would say you've got your work cut out for you if all Arachne code looks like this. If you can get disk swapping running using the existing source code, using @toncho11's idea of then just replacing this with an XMS ramdisk would be by far the easiest going forward. So essentially this means the path forward is to continue porting Arachne to ELKS using its existing source base and its existing disk swapping, with its XMS/HIMEM.SYS support turned off, if possible.

I agree. The code is not easy to follow indeed, but the interfaces for swapping seem a good start:
https://github.com/sraase/arachne-browser/blob/main/bufbuf.h

And if it works out-of-the-box (or almost) it is already great stuff.

@ghaerr
Owner

ghaerr commented Mar 18, 2025

there should be a /dev/xms on which mkfs will put minix and then we will be able to mount it?

Yes, pretty much. Either a new /dev/xms or probably better using the existing /dev/rd0 or /dev/rd1 ramdisks (or /dev/ssd), which are then set up using existing utilities to create the size of the ram disk, specify XMS or not, and then the MINIX filesystem on it.

Currently we can do the following, which would then be very similar, perhaps with another option to use XMS memory instead of main memory:

ramdisk /dev/rd0 make 96       # creates 96k ram disk
mkfs /dev/rd0 96               # creates 96k MINIX filesystem
fsck -lvf /dev/rd0             # checks filesystem
mount /dev/rd0 /tmp/swapdisk   # mounts tmp filesystem

For XMS, we could probably just specify "makexms" instead of "make" in the first line above.

@toncho11
Contributor

Yes, pretty much. Either a new /dev/xms or probably better using the existing /dev/rd0 or /dev/rd1 ramdisks (or /dev/ssd), which are then set up using existing utilities to create the size of the ram disk, specify XMS or not, and then the MINIX filesystem on it.

Currently we can do the following, which would then be very similar, perhaps with another option to use XMS memory instead of main memory:

ramdisk /dev/rd0 make 96       # creates 96k ram disk
mkfs /dev/rd0 96               # creates 96k MINIX filesystem
fsck -lvf /dev/rd0             # checks filesystem
mount /dev/rd0 /tmp/swapdisk   # mounts tmp filesystem

For XMS, we could probably just specify "makexms" instead of "make" in the first line above.

I see - ramdisk will bind /dev/rd0 to either standard RAM or XMS. Before that it is "empty" of size 0.

@ghaerr
Owner

ghaerr commented Mar 18, 2025

if it works out-of-the-box (or almost) it is already great stuff.

The more I think about @toncho11's idea of using an XMS ram disk for handling programs that would not otherwise be able to handle larger amounts of data, the more I like it. The sole application requirement is that it contains a capability to write/swap data to disk and back (not a big technical feat, but definitely requires designing up front) - and "disk access" could be greatly sped up with the use of XMS memory with very little development, providing we look for the right applications in the first place.

This goes for editors too - there are a number of "older" editors that used disk swapping in order to handle large files, instead of the "new" way of just allocating huge amounts of memory on 32-/64-bit machines using malloc or mmap. These editors could probably be very easily converted to work on ELKS by just changing the name of the directory used for ram/disk swapping, and having the system set up as @toncho11 is suggesting with a large RAM disk.

For instance, our own vi already allows editing of files larger than its own 64k data segment by always creating a "swap" file in /tmp. We could easily enhance vi to use "/tmp/swap" instead of "/tmp" as the temp directory for this file, and an XMS ram disk could be already mounted on /tmp/swap. The temp directory location could be set for vi and all other such programs with an environment variable TMPDIR= (which would default to /tmp if XMS not enabled) and everything would work well.
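
A minimal sketch of the TMPDIR lookup described here, assuming a program wants to build the path of its swap file: the helper names tmp_dir_or_default() and swap_path() are hypothetical and this is not how vi currently does it. The directory falls back to /tmp when TMPDIR is unset or empty.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Pick the swap directory: the TMPDIR value if set and non-empty, else /tmp. */
static const char *tmp_dir_or_default(const char *env_value)
{
    if (env_value == NULL || *env_value == '\0')
        return "/tmp";
    return env_value;
}

/* Build "<dir>/<name>" in out, where dir comes from the TMPDIR environment
 * variable. Returns 0 on success, -1 if the result does not fit in outlen. */
static int swap_path(char *out, size_t outlen, const char *name)
{
    const char *dir = tmp_dir_or_default(getenv("TMPDIR"));

    if (strlen(dir) + 1 + strlen(name) + 1 > outlen)
        return -1;
    sprintf(out, "%s/%s", dir, name);
    return 0;
}
```

With TMPDIR=/tmp/swap exported in the environment and an XMS ram disk mounted there, a program using swap_path() would place its swap file on the RAM disk. Without it, the file lands in /tmp as before.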

(Actually, a large RAM disk could be mounted directly on /tmp now and programs would automatically use RAM disk for their temp files, without any modifications at all).

If we replaced our current set of programs (e.g. kilo, nano, possibly edit and other editors) that only work with memory-based files with programs that used disk space when out of RAM, ELKS would be quite a bit more useful on systems without much free memory, or when running networking and two logins etc. when memory is already short.

Thinking like this, it becomes very interesting to consider replacing those programs in ELKS now that don't work well with limited memory with those that are designed to use disk swap files, with or without XMS.

@ghaerr
Owner

ghaerr commented Mar 18, 2025

@rafael2k, this same idea can also apply to image processing. As you know, many images are far too large to be fully decoded and held in RAM before being displayed. Using image decoders that can swap scanlines to swap disk, or even better decode scanline-by-scanline for display, will allow even large images to be displayed on ELKS.

I recently used this approach with paint, where I realized even the combined (even small) images displayed on the right control panel, if held in memory, would eat up over half of paint's 64k data segment. So I took the Microwindows BMP decoder and rewrote it to fopen the image file, and decode each BMP using only a single 640x4 byte RGBA scanline buffer directly to the VGA, thus using only 2560 bytes for all image display.
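
For anyone wanting to try the same technique, here is a rough sketch of single-scanline decoding. It is not the paint or Microwindows code; the names bmp_reader, bmp_open() and bmp_next_row() are made up, and it only handles uncompressed 24-bit bottom-up BMP files up to 640 pixels wide. The caller supplies one 640x4 byte RGBA line buffer and draws each row as it arrives.

```c
#include <stdio.h>
#include <stdint.h>

#define BMP_MAX_WIDTH 640               /* widest image the line buffer must hold */

struct bmp_reader {
    FILE *fp;
    long width, height;                 /* image size in pixels */
    long pad;                           /* padding bytes after each stored row */
    long rows_read;                     /* rows decoded so far */
};

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the headers of an uncompressed 24-bit bottom-up BMP.
 * Returns 0 on success, -1 if the file is truncated or unsupported. */
static int bmp_open(struct bmp_reader *br, FILE *fp)
{
    uint8_t hdr[54];                    /* 14-byte file header + 40-byte info header */

    if (fread(hdr, 1, sizeof(hdr), fp) != sizeof(hdr))
        return -1;
    if (hdr[0] != 'B' || hdr[1] != 'M')
        return -1;
    br->width = (long)(int32_t)le32(hdr + 18);
    br->height = (long)(int32_t)le32(hdr + 22);
    if (br->width <= 0 || br->width > BMP_MAX_WIDTH || br->height <= 0)
        return -1;
    if (hdr[28] != 24 || hdr[29] != 0 || le32(hdr + 30) != 0)
        return -1;                      /* only 24 bits per pixel, no compression */
    if (fseek(fp, (long)le32(hdr + 10), SEEK_SET) != 0)
        return -1;
    br->fp = fp;
    br->pad = (4 - (br->width * 3) % 4) % 4;   /* stored rows are 4-byte aligned */
    br->rows_read = 0;
    return 0;
}

/* Decode the next stored row into rgba (width * 4 bytes). Returns the row's
 * y coordinate (bottom row first), or -1 when done or on a short read. */
static int bmp_next_row(struct bmp_reader *br, uint8_t *rgba)
{
    long x;
    int b, g, r;

    if (br->rows_read >= br->height)
        return -1;
    for (x = 0; x < br->width; x++) {
        b = getc(br->fp);               /* pixels are stored as B, G, R */
        g = getc(br->fp);
        r = getc(br->fp);
        if (r == EOF)
            return -1;
        rgba[x * 4 + 0] = (uint8_t)r;
        rgba[x * 4 + 1] = (uint8_t)g;
        rgba[x * 4 + 2] = (uint8_t)b;
        rgba[x * 4 + 3] = 255;          /* opaque alpha */
    }
    for (x = 0; x < br->pad; x++)
        getc(br->fp);                   /* skip row padding */
    br->rows_read++;
    return (int)(br->height - br->rows_read);
}
```

A caller would declare one uint8_t line[BMP_MAX_WIDTH * 4] buffer (2560 bytes), call bmp_open() once, then loop on bmp_next_row() and blit each returned row at its y position. Memory use stays constant regardless of image height.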

Of course, BMP decoding is well suited for single scanline decoding, while JPG probably needs at least 8 or 16 scanlines for the dither blocks, and I'm not sure how PNG works. I haven't had time to rewrite the Nano-X server to use this same approach, but this solves a big problem since the current Microwindows design is to have the server hold any images in decoded RGBA internal memory, for fast blitting to the output device. A single larger image would cause the NX server to run out of memory.

This excess memory usage issue is the reason I've not been able to port the NX "launcher" application, which is similar to nxstart but uses graphical desktop icons to start applications. Even displaying each of those icons currently causes NX to run out of memory, before the user has a chance to do anything! I had previously considered linking in the images instead of loading them, but this still uses tons of memory. By porting the new single scanline BMP decoder from paint into Nano-X, we'll be able to proceed with the graphical desktop. Of course, there are a lot of other ways to run out of memory, but I wanted to mention this as a design consideration for ELKS that should pay dividends.

It would be nice to decide on a somewhat-standard "ELKS" image format, so we don't have to write tons of decoders, and possibly then use image translator tools to convert to that standard format. While BMP allows for easy scanline decoding, it's pretty old. Microwindows also supports the super old PNM/PBM/PGM/PPM formats, but there aren't many tools for converting them (the NX launcher icons are all in PGM or PPM format).

What are your thoughts on this? Have you looked deeper into some of the elks-viewer/ decoders to see whether they might be able to handle single-scanline decoding, or are you having to decode single images entirely into memory before displaying?

@rafael2k
Contributor

rafael2k commented Mar 18, 2025

> @rafael2k, this same idea can also apply to image processing. As you know, many images are far too large to be fully decoded and held in RAM before being displayed. Using image decoders that can swap scanlines out to disk, or even better, decode scanline-by-scanline for display, will allow even large images to be displayed on ELKS.

> I recently used this approach with paint, where I realized that the images displayed on the right control panel, small as they are, would together eat up over half of paint's 64k data segment if held in memory. So I took the Microwindows BMP decoder and rewrote it to fopen the image file and decode each BMP directly to the VGA through a single 640x4-byte RGBA scanline buffer, thus using only 2560 bytes for all image display.

> Of course, BMP decoding is well suited for single scanline decoding, while JPG probably needs at least 8 or 16 scanlines at a time for its 8x8 blocks, and I'm not sure how PNG works. I haven't had time to rewrite the Nano-X server to use this same approach, but doing so would solve a big problem: the current Microwindows design has the server hold all images in memory in decoded RGBA format, for fast blitting to the output device. A single larger image would cause the NX server to run out of memory.

PNG needs deflate and so is pretty memory intensive; this is why I'm holding off a bit on working on PNG.

> This excess memory usage issue is the reason I've not been able to port the NX "launcher" application, which is similar to nxstart but uses graphical desktop icons to start applications. Even displaying each of those icons currently causes NX to run out of memory, before the user has a chance to do anything! I had previously considered linking in the images instead of loading them, but this still uses tons of memory. By porting the new single scanline BMP decoder from paint into Nano-X, we'll be able to proceed with the graphical desktop. Of course, there are a lot of other ways to run out of memory, but I wanted to mention this as a design consideration for ELKS that should pay dividends.

Cool! Certainly it is a design consideration. I used it for the elks-viewer.

> It would be nice to decide on a somewhat-standard "ELKS" image format, so we don't have to write tons of decoders, and possibly then use image translator tools to convert to that standard format. While BMP allows for easy scanline decoding, it's pretty old. Microwindows also supports the super old PNM/PBM/PGM/PPM formats, but there aren't many tools for converting them (the NX launcher icons are all in PGM or PPM format).

> What are your thoughts on this? Have you looked deeper into some of the elks-viewer/ decoders to see whether they might be able to handle single-scanline decoding, or are you having to decode single images entirely into memory before displaying?

All the decoders are single-scanline decoders (BMP, PGM and PPM), while the JPEG decoder works per macroblock. They are pretty small and use very little memory. The ELKS image format, in my opinion, should be BMP (with RLE support, if possible! : ) ). It supports 1-bit, 4-bit and 8-bit palette modes (with RLE encoding available for the 4-bit and 8-bit modes), and also higher color modes.
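For reference, RLE fits the one-scanline-at-a-time model as well. Below is a minimal sketch of decoding a single row of 8-bit RLE (BI_RLE8) BMP data into a buffer of palette indexes. The function name and return codes are hypothetical, this is not code from elks-viewer, and the "delta" escape is deliberately left unsupported to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { RLE_ERR = -1, RLE_ROW = 0, RLE_END = 1 };

/* Decode one RLE8 scanline from fp into row[0..width-1].
 * Returns RLE_ROW at an end-of-line marker, RLE_END at the end-of-bitmap
 * marker (row then holds any pixels decoded before it), or RLE_ERR. */
int rle8_read_row(FILE *fp, uint8_t *row, int width)
{
    int x = 0;

    assert(width > 0);
    memset(row, 0, (size_t)width);    /* pixels not covered stay index 0 */
    for (;;) {
        int count = fgetc(fp);
        int value = fgetc(fp);

        if (count == EOF || value == EOF) return RLE_ERR;
        if (count > 0) {              /* encoded run: value repeated count times */
            if (x + count > width) return RLE_ERR;
            memset(row + x, value, (size_t)count);
            x += count;
        } else if (value == 0) {      /* escape 0: end of line */
            return RLE_ROW;
        } else if (value == 1) {      /* escape 1: end of bitmap */
            return RLE_END;
        } else if (value == 2) {      /* escape 2: delta, not handled here */
            return RLE_ERR;
        } else {                      /* absolute run of 'value' literal bytes */
            if (x + value > width) return RLE_ERR;
            if (fread(row + x, 1, (size_t)value, fp) != (size_t)value)
                return RLE_ERR;
            x += value;
            /* absolute runs are padded to an even byte count */
            if ((value & 1) && fgetc(fp) == EOF) return RLE_ERR;
        }
    }
}
```

The only working memory is the one-row index buffer, so this would slot in next to the existing uncompressed scanline path, with a palette lookup applied to each row before display.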

@rafael2k
Contributor

And I also like @toncho11's idea of a ramdisk on XMS. Userland programs that already support a swap file would be ready to use XMS.

@rafael2k
Contributor

TLVC has recently had many commits adding support for HMA:

https://github.com/Mellvik/TLVC/commits/master/
