Upcoming Posts

Make your own partition magic software.
How to make an assembler.
What is booting?
Write a simple OS!

‎"I have no special talents. I am only passionately curious." - Albert Einstein

Wednesday, January 23, 2013

Many MB/GB space just disappeared from my Hard disk after attaching it to the computer

Many of us have noticed that the OS doesn't report the same size that is printed on the hard disk's label. For example, if you purchase a 500 GB hard disk, the OS will show it as ~465 GB. Where has the remaining ~35 GB gone? Some people say the OS uses it for its internal work or for partitioning/formatting. But that's completely incorrect.

The size difference arises because hard disk manufacturers calculate MB/GB differently from OS/software makers. Hard disk manufacturers follow the SI decimal prefixes, as shown below:

kilobyte (kB)      1,000
Megabyte (MB)      1,000 * 1,000
Gigabyte (GB)      1,000 * 1,000 * 1,000
...


However, OS/software makers calculate the same units in powers of 1,024:

kilobyte (kB)      1,024
Megabyte (MB)      1,024 * 1,024
Gigabyte (GB)      1,024 * 1,024 * 1,024
...


Similarly, the meaning of megabyte has been confused for many years. For instance, the capacity of the "1.44 MB" floppy disk was calculated using 1,024,000 bytes per "MB" (i.e. 1.44×1000×1024 = 1,474,560 bytes), which is really 1.47 MB (1.47×1000×1000) or 1.41 MiB (1.41×1024×1024).

Because of this difference in calculation, we notice a big difference in size. For example, if 500 GB is printed on the hard disk, it means 500 GB in SI units, i.e. 500 GB = 500 * 1000 * 1000 * 1000 bytes. So the manufacturer really did give you a hard disk of 500 * 1000 * 1000 * 1000 bytes. The OS will show it as ~465 GB [ (500 * 1000 * 1000 * 1000) / (1024 * 1024 * 1024) ]. So the disk space is not being used up anywhere. It is just being counted in a different unit :).

To avoid this confusion, the IEC introduced binary prefixes in December 1998, as shown below:


kibibyte (KiB)     1,024
mebibyte (MiB)     1,024 * 1,024
gibibyte (GiB)     1,024 * 1,024 * 1,024
...


Their use is presently accepted by the Institute of Electrical and Electronics Engineers (IEEE) and the International Committee for Weights and Measures (CIPM). OS and software makers should start using these prefixes correctly to avoid such confusion. I have seen that some software, like the FileZilla FTP client, has started using the MiB/GiB prefixes.

Saturday, June 9, 2012

VTABLE Table in C/C++

Non-virtual functions are bound statically, i.e. the compiler emits a 'CALL' assembly instruction together with the function's address, so the call is resolved at compile/link time. However, to resolve virtual methods, the compiler creates a table called the VTABLE and puts the addresses of all the virtual methods in it. With the help of this table, calls are resolved to the appropriate method at runtime. The compiler emits code that fetches the address of the virtual method from the VTABLE and then calls it.

The VTABLE is created at compile time (not at run time) and lives in the data segment, not on the heap. The compiler creates only one VTABLE per class that contains virtual methods (including inherited ones), and every object of that class has a VPTR pointer member pointing to the VTABLE. The VPTR is initialized by the class's constructors (including the default and copy constructors); the assignment operator leaves it unchanged.

Thursday, May 31, 2012

Linux Programming Interfaces (or API)

There are two types of programming interfaces under Linux:

1) APIs for User Space Applications:

Linux provides an API for user-space applications (everything outside the kernel) in the form of system calls. It is the interface between an application and the kernel.

Usually, we do not invoke system calls directly. Instead, we call wrapper methods like “exit()” provided by a library such as libc. For more information on system calls and their implementation, please go through the “System call” post.

2) APIs for Kernel Space Applications:

To make kernel programming (writing modules/drivers) easier, the Linux kernel makes many of its methods available to kernel code, including system call handlers. These methods are also called the kernel interface. To see these symbols, you can run the “cat /proc/kallsyms” command.

Example,

[root@linux ~]# grep sys_exit /proc/kallsyms
ffffffff810709f0 T sys_exit

“sys_exit” is the kernel method which handles the exit system call. From a kernel module, we can call kernel APIs directly. Note that /proc/kallsyms lists kernel symbols in general, and a module can call a symbol only if the kernel actually exports it.

Kernel programs in Linux are known as kernel modules. These modules can be loaded and unloaded at runtime without rebooting the system. Once they are loaded into memory, they become part of the kernel. Thus they can access all kernel-exported methods directly.

Tuesday, April 24, 2012

Write your own Socket library in C/C++ for Linux

We have read a lot about invoking system calls on Linux. Theory is useless unless we put it into practice :) (at least I believe so). So let's play with sockets by calling system/kernel services directly, i.e. we will not use the methods provided by the C runtime libs. We will implement methods of the POSIX socket API like socket, bind etc.

On 32-bit x86, Linux provides a single system call called "sys_socketcall" for socket-related operations. Its system call number is 102 (defined as the macro __NR_socketcall in /usr/include/asm/unistd.h). Its prototype in the kernel is as follows:

long sys_socketcall(int call, unsigned long *args)

The first parameter 'call' is an integer that identifies the specific operation to be performed. The possible values for 'call' are defined in /usr/include/linux/net.h. Some of them are listed below:

#define SYS_SOCKET      1               /* sys_socket(2)                */
#define SYS_BIND        2               /* sys_bind(2)                  */
#define SYS_CONNECT     3               /* sys_connect(2)               */
#define SYS_LISTEN      4               /* sys_listen(2)                */
#define SYS_ACCEPT      5               /* sys_accept(2)                */

The second parameter 'args' is a pointer to an array of 'long' containing arguments for the operation.

Let's implement POSIX socket creation method "int socket(int domain, int type, int protocol)":

#include <asm/unistd.h>   /* __NR_socketcall (defined for 32-bit x86) */
#include <linux/net.h>    /* SYS_SOCKET */

/* Build for 32-bit x86, e.g. with "gcc -m32". */
int my_socket(int domain, int type, int protocol)
{
    long arg[3], sock = 0;

    /*
       Copy 'domain', 'type' and 'protocol' into the 'arg' array. This array
       will be passed to the kernel as the parameter of the system call.
     */
    arg[0] = domain;
    arg[1] = type;
    arg[2] = protocol;

    /*
       long sys_socketcall(int call, unsigned long *args);

       To make the above system call, we need to copy the "__NR_socketcall"
       system call number into the eax register, the "SYS_SOCKET" socket
       operation 'call' code into ebx, and the address of the array of 'long'
       containing 'domain', 'type' and 'protocol' into the ecx register.

       The asm statement below does exactly that. For more information,
       search Google for 'assembly using gcc' :).
     */
    asm volatile(
        "int $0x80\n\t"  /* int 0x80 invokes the Linux system call */
        : "=a"(sock)     /* output: the kernel returns its result in eax,
                            copy the value of eax into sock */
        : "a"(__NR_socketcall), "b"(SYS_SOCKET), "c"(arg)
                         /* inputs: system call number in eax, 'call' code
                            in ebx, address of the 'arg' array in ecx */
        : "memory"       /* the kernel reads 'arg' through the pointer */
    ); /* asm statement ends here */

    return sock; /* socket FD on success, negative error code on failure */
}

For the sake of simplicity, I haven't performed any sanity checks. While writing your own implementation, please do perform sanity checks before invoking the system call.

Similarly, you can define all the socket methods like bind, listen, accept, send etc. and make your own socket library. Isn't it simple? Maybe I picked a simple one ;) to keep the system call simple.


Please leave your comments!!!

Thursday, April 19, 2012

How does a 64-bit Linux kernel run a 32-bit application?

I like questions which make me think and challenge my technical ability. During a telephone interview, the interviewer asked me, "How does a 64-bit Linux kernel run a 32-bit application?" How would you have answered this question? After taking a couple of seconds, I said, "That's really a good question (I was buying time :) to think). Hmmm. I am not sure, but I believe either the 32-bit C/C++ library or the system call handler for 32-bit will adjust the input/output parameters, according to the 64-bit data model, before passing them to the kernel." Read further if you are curious to know the correct answer :):

Whenever a system call is invoked, the system call handler method gets called. The handler validates the input/output parameters and calls the appropriate function, which actually performs the requested operation. I.e. the handler is a kind of wrapper over the actual method which processes the request.

A 64-bit Linux kernel uses the LP64 data model, which means "long" and pointers are 64 bits wide. I.e. the "long" and pointer data types must be 64 bits wide when a 64-bit application invokes system calls. This causes a problem for 32-bit applications, as they treat "long" and pointers as 32 bits wide.

Because of the above-mentioned difference, the 64-bit kernel has to have separate system call handler methods 1) for 64-bit system calls and 2) for 32-bit system calls. Please note that the kernel does not maintain two versions of the core methods which actually process requests like opening a file. Let's look at the picture below:




The core methods of the 64-bit kernel expect the "long" and pointer data types to be 64 bits wide. This is fine for 64-bit system calls. However, a 32-bit system call handler has to widen 32-bit "long" and pointer values to 64-bit ones before passing them to the kernel's core methods. For this conversion/adjustment, the Linux kernel uses a compat (compatibility) layer. This layer has to be part of the kernel.

In short, it is the system call handler for 32-bit (not the C/C++ library) which adjusts/converts the input/output parameters, according to the 64-bit data model, before passing them to the kernel.

Please leave a comment to help me improve my writing skills :).

Wednesday, April 18, 2012

Invoking System calls on Linux

It is not possible to directly link (using any compiler) user-space applications with kernel space. For security and reliability reasons, the user-space applications must not be allowed to directly execute kernel code or manipulate kernel data. Instead, the kernel must provide a mechanism by which a user-space application can "signal" the kernel that it wishes to invoke a system call. The application can then trap into the kernel through this well-defined mechanism, and execute only code that the kernel allows it to execute. The exact mechanism varies from architecture to architecture.

According to the x86-32 Linux system call convention (on the i386 architecture), a system call is made via the "int 0x80" instruction, and the following registers are used to pass parameters to the kernel:

eax - System call number.
ebx, ecx, edx, esi, edi, ebp - Used for passing up to 6 parameters.

If there are more than six arguments, ebx must contain the memory location where the list of arguments is stored. The return value depends on the system call. Most system calls return their result in eax, but not all. If it is not there, you need further research :).

According to the x86-64 Linux system call convention, a system call is made via the "syscall" instruction, and the following registers are used to pass parameters to the kernel:

rax - System call number.
rdi, rsi, rdx, r10, r8 and r9 - Used for passing up to 6 parameters.

Please note that the kernel destroys registers rcx and r11. Thus you may need to take care of the values stored in rcx and r11. The return value depends on the system call. Most system calls return their result in rax, but not all. If it is not there, you need further research :).

There are three ways to invoke a system call:

1. Invoking system calls indirectly by calling C/C++ standard methods:

exit_c.c:

#include <stdlib.h>

int main(void)
{
    exit(6); /* to exit current process with 6 as exit code */
}

gcc exit_c.c
./a.out

Here, we are calling the C/C++ "exit()" method to terminate the process with 6 as the exit code. The "exit()" method in the C/C++ library makes the appropriate system call to terminate the process. Since we are calling a standard method here, this program is portable and works on all platforms.

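2. Invoking system calls indirectly by calling the C wrapper method for system calls:

exit_ind_sys.c:

#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    syscall(SYS_exit, 6); /* SYS_exit is the system call number for exiting the program (1 on i386) and 6 is the user defined exit status code */
    return 0; /* not reached */
}

gcc exit_ind_sys.c
./a.out

Here, "syscall()" is a GNU C library function. It is harder to use and less portable than C functions like "exit()", but easier and more portable than coding the system call in assembler instructions. Use the SYS_exit macro rather than a hard-coded 1, because system call numbers differ between architectures: on x86-64, number 1 is write, not exit.
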
3. Invoking system calls directly by writing assembly code:

exit_as.c:

int main(void)
{
    asm(
        "movl $1, %eax\n\t" /* 1 is the system call number for exiting the program (i386 numbering) */
        "movl $6, %ebx\n\t" /* 6 is the user defined exit status code */
        "int $0x80\n\t"     /* trap into the kernel */
    );
    return 0; /* not reached */
}

gcc exit_as.c
./a.out

Here, we are calling the system API directly by executing the "int $0x80" instruction from the program itself. This makes the code non-portable. It will work with Linux on x86 (Intel/AMD) machines only.

Getting system call numbers and their parameters:

System call numbers are defined in the "/usr/include/asm/unistd.h" header file, e.g. "#define __NR_exit 1". The parameters of each system call are declared in the kernel's "syscalls.h" file, e.g. "asmlinkage long sys_exit(int error_code);".

[root@localhost linux]# find / -name syscalls.h
/usr/src/kernels/2.6.18-8.el5-i686/include/linux/syscalls.h


Linux Header Files:

/usr/include/asm/*.h            The Linux API ASM Headers
/usr/include/asm-generic/*.h    The Linux API ASM Generic Headers
/usr/include/drm/*.h            The Linux API DRM Headers
/usr/include/linux/*.h          The Linux API Linux Headers
/usr/include/mtd/*.h            The Linux API MTD Headers
/usr/include/rdma/*.h           The Linux API RDMA Headers
/usr/include/scsi/*.h           The Linux API SCSI Headers
/usr/include/sound/*.h          The Linux API Sound Headers
/usr/include/video/*.h          The Linux API Video Headers
/usr/include/xen/*.h            The Linux API Xen Headers



Reference:
http://stackoverflow.com/questions/2535989/what-are-the-calling-conventions-for-unix-linux-system-calls-on-x86-64
http://asm.sourceforge.net/syscall.html

Tuesday, April 17, 2012

What is System Call?

A system call is the fundamental interface between an application and the operating system's kernel. It is a way to request a service, like creating/opening a file, from the kernel.

1. System calls are not normal method calls. No libraries or header files are supplied by the OS, i.e. stdio.h is not provided by the OS. It's provided by the C/C++ language.
2. Usually system calls are not invoked directly, but rather via wrapper functions in glibc (or perhaps some other library).
3. Input or output parameters to system calls are copied to registers or pushed onto the stack.
4. System calls are OS dependent, i.e. two operating systems may not have the same system calls.

A few categories of system calls are as follows:

Process Control - load/execute/create/terminate process etc.
File management - create/delete/open/close/read/write file etc.
Device Management - request/release/read/write devices.



It is not possible to directly link (using any compiler) user-space applications with kernel space. For reasons of security and reliability, user-space applications must not be allowed to directly execute kernel code or manipulate kernel data.


Instead, the kernel must provide a mechanism by which a user-space application can "signal" the kernel that it wishes to invoke a system call. The application can then trap into the kernel through this well-defined mechanism, and execute only code that the kernel allows it to execute. The exact mechanism varies from architecture to architecture.


Typically, system calls are implemented as a software interrupt or trap. Nowadays, additional mechanisms like SYSCALL/SYSRET and SYSENTER/SYSEXIT (the two mechanisms were independently created by AMD and Intel, respectively) are also used.

Generally, System calls are not invoked directly, but rather via wrapper functions in glibc (or perhaps some other library). Why? Because System calls are OS dependent. Making direct System calls from the program will make it non-portable. Instead, we should call wrapper methods/API provided by Standard C/C++ libs.


Let's take an example. Say you'd like to create a new file and write your name to it. You'll write the following C program:

#include <stdio.h>

int main(void)
{
    char name[64] = "Dew Kumar";
    FILE *playerdata;

    playerdata = fopen("player.txt", "w+"); /* create the new file (file name chosen for illustration) */
    if (playerdata == NULL)
        return 1; /* the file could not be created */

    fputs(name, playerdata); /* write the player's name to the file */
    fclose(playerdata);      /* close the file */
    return 0;
}

Here, fopen(), fputs() and fclose() are the library methods/APIs which make system calls to open, write and close the file.

Reference: http://en.wikipedia.org/wiki/System_call