This blog caters to technical topics: a place to share things I find useful in my work and career.
Monday, July 10, 2006
Introduction to Linux Capabilities and ACL's
Jeremy Rauch 2000-08-28
1. Introduction
Unix systems have always utilized a security system that gives normal users a minimal amount of privilege, while creating a single account, known as the 'root' account, that has full privileges. The root account is used to administer the machine, install software, create new users, and run certain services. Many common activities that require root privileges are made available to ordinary users by installing the programs that perform them setuid root.
This dependence upon a single account for performing all actions requiring privilege has proven to be somewhat dangerous. Programs often need root privileges for a single activity, such as binding to a privileged port, or opening a file only root can access. Vulnerabilities are often found that could perhaps be eliminated if these programs didn't run as root.
In version 2.1 of the Linux kernel, work was started to add what are known as capabilities. The goal of this work was to eliminate the dependence on the root account for certain actions. As of the 2.2 kernels, this code is beginning to become useful. While problems still exist with it, it is a step in the right direction.
Another interesting project, also intended to compensate for some shortcomings in the file access control realm, is the Linux ACL project. It extends ext2fs to allow for a finer degree of access control than is normally available on a Unix filesystem.
We'll briefly discuss each of these features, talk about where they may be useful, and give short examples of using them and their associated utilities. Some portions of this article require installing kernel patches and patching and replacing system utilities. Take extreme care when doing so, and don't attempt it on production or other critical systems.
2. Linux Capabilities
As was briefly touched upon in the introduction, the Linux capabilities system was designed to help remove the problems associated with the need for root privileges. Before we go any further, we need to touch upon why certain privileges are reserved for root, and why certain programs need to run as root.
According to Unix security conventions, there are certain privileges reserved only for the root account. Privileged actions include things like binding a process to a privileged port, loading kernel modules, mounting and unmounting file systems, and a variety of other system activities. root is also the only account capable of adding or altering account information and installing system utilities. The privileges of root are extensive.
So what kinds of things would an average user need root access for? A simple example is the ping program. Ping needs to open a raw socket in order to send its packets to the network. This is an operation only root may perform, so the ping program is installed setuid root. This allows any user to run the program and still be able to open a raw socket. Unfortunately, it also means the program runs with the full privileges of root: should a flaw be discovered in the program, an attacker could potentially perform any activity root can perform.
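The setuid mechanism itself is easy to see in the permission bits. A small sketch (the /tmp path and the copy of /bin/ls are stand-ins for illustration, not anything ping-specific; the bit only grants root privileges when the file is actually owned by root):

```shell
# Mark a scratch binary setuid the way ping is traditionally installed,
# then inspect the permission bits.
cp /bin/ls /tmp/demo-setuid     # stand-in binary; any executable works
chmod 4755 /tmp/demo-setuid     # leading 4 = the setuid bit
ls -l /tmp/demo-setuid          # owner's execute bit shows as 's' (rws), not 'x'
rm /tmp/demo-setuid
```

When a root-owned file carries that bit, the kernel runs it with root's effective UID no matter who invokes it, which is exactly the exposure capabilities aim to shrink.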
Capabilities work by breaking the actions normally reserved for root down into smaller pieces. Linux has implemented 7 of the capabilities outlined in the POSIX 1003.1e draft, plus another 20 or so Linux-specific ones. Some of the more useful ones are described in the table below.
Capability Name Meaning
CAP_CHOWN Allow for the changing of file ownership
CAP_DAC_OVERRIDE Override all DAC access restrictions
CAP_DAC_READ_SEARCH Override all DAC restrictions regarding read and search
CAP_KILL Allow the sending of signals to processes belonging to others
CAP_SETGID Allow changing of the GID
CAP_SETUID Allow changing of the UID
CAP_SETPCAP Allow the transfer and removal of capabilities in the current set to/from any PID
CAP_LINUX_IMMUTABLE Allow the modification of immutable and append-only files
CAP_NET_BIND_SERVICE Allow binding to ports below 1024
CAP_NET_RAW Allow use of raw sockets
There are a number of other capabilities implemented; if you're interested in looking at them, and you have a 2.2.x kernel installed, you should be able to view them in /usr/include/linux/capability.h.
For the most part, capabilities are currently most useful to programmers. Their use is only beginning to trickle into userland applications; most system utilities do not yet shed their root privileges, and the mechanism to set capabilities on binaries on the filesystem does not yet exist. We should expect capabilities to become genuinely useful in the 2.4 kernels. There are, however, uses of capabilities for system administrators today.
In Linux kernels 2.2.11 and later, the concept of a capability bounding set exists. This is the set of capabilities allowed on the system; by removing capabilities from it, they can be eliminated outright. Once a capability has been removed, it cannot be added back. The /proc/sys/kernel/cap-bound proc file controls this set, and the lcap program provides a clean, simple interface for removing capabilities. It is available at http://home.netcom.com/~spoon/lcap/. One of the nicer things this enables is the disabling of module loading:
[root@hamachi lcap-0.0.3]# ./lcap
Current capabilities: 0xFFFEFFFF
0) *CAP_CHOWN 1) *CAP_DAC_OVERRIDE
2) *CAP_DAC_READ_SEARCH 3) *CAP_FOWNER
4) *CAP_FSETID 5) *CAP_KILL
6) *CAP_SETGID 7) *CAP_SETUID
8) *CAP_SETPCAP 9) *CAP_LINUX_IMMUTABLE
10) *CAP_NET_BIND_SERVICE 11) *CAP_NET_BROADCAST
12) *CAP_NET_ADMIN 13) *CAP_NET_RAW
14) *CAP_IPC_LOCK 15) *CAP_IPC_OWNER
16) *CAP_SYS_MODULE 17) *CAP_SYS_RAWIO
18) *CAP_SYS_CHROOT 19) *CAP_SYS_PTRACE
20) *CAP_SYS_PACCT 21) *CAP_SYS_ADMIN
22) *CAP_SYS_BOOT 23) *CAP_SYS_NICE
24) *CAP_SYS_RESOURCE 25) *CAP_SYS_TIME
26) *CAP_SYS_TTY_CONFIG
* = Capabilities currently allowed
[root@hamachi lcap-0.0.3]# ./lcap CAP_SYS_MODULE
[root@hamachi lcap-0.0.3]# ./lcap
Current capabilities: 0xFFFEFFFF
0) *CAP_CHOWN 1) *CAP_DAC_OVERRIDE
2) *CAP_DAC_READ_SEARCH 3) *CAP_FOWNER
4) *CAP_FSETID 5) *CAP_KILL
6) *CAP_SETGID 7) *CAP_SETUID
8) *CAP_SETPCAP 9) *CAP_LINUX_IMMUTABLE
10) *CAP_NET_BIND_SERVICE 11) *CAP_NET_BROADCAST
12) *CAP_NET_ADMIN 13) *CAP_NET_RAW
14) *CAP_IPC_LOCK 15) *CAP_IPC_OWNER
16) CAP_SYS_MODULE 17) *CAP_SYS_RAWIO
18) *CAP_SYS_CHROOT 19) *CAP_SYS_PTRACE
20) *CAP_SYS_PACCT 21) *CAP_SYS_ADMIN
22) *CAP_SYS_BOOT 23) *CAP_SYS_NICE
24) *CAP_SYS_RESOURCE 25) *CAP_SYS_TIME
26) *CAP_SYS_TTY_CONFIG
* = Capabilities currently allowed
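The same removal can be done without lcap by writing the bounding-set mask directly. A sketch, assuming root on a 2.2.11+ kernel where the cap-bound proc file exists:

```shell
# Clear CAP_SYS_MODULE (capability number 16) from the bounding set by hand.
CUR=$(cat /proc/sys/kernel/cap-bound)
NEW=$(( CUR & ~(1 << 16) ))        # clear bit 16 = CAP_SYS_MODULE
echo $NEW > /proc/sys/kernel/cap-bound
```

Each capability number is a bit position in the mask, so removing one is a single AND with the inverted bit.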
As capabilities begin to enjoy more widespread use, vulnerabilities that lead to complete root access should become far less common.
3. Posix ACL's
Another interesting, though seemingly less mainstream, kernel project is the work being done to introduce POSIX file access control list (ACL) support in Linux. Those familiar with commercial Unix OS's may have used, or at least seen, POSIX ACL's in action.
ACL's present a way to add finer file-level access control. Whereas default Unix permissions are associated with a single owner, a single group, and the rest of the world, ACL's allow permissions to be set for multiple additional users and groups. Rather than rehash the details, it's worthwhile to read an article on the use of ACL's under Solaris (see the "Focus On Sun: ACL's" link below). It covers basic use of the getfacl and setfacl utilities, both of which are implemented with an almost identical interface under Linux.
ACL's are not included by default in the Linux kernel, nor are the associated changes that need to be made to certain system utilities to support ACL's correctly. The userland ACL utilities, kernel patches, and patches to e2fsck and GNU fileutils are all available at http://acl.bestbits.at/download.html, which is the home site of the project. The e2fsprogs tarfile is available at ftp://download.sourceforge.net/pub/sourceforge/e2fsprogs/, and fileutils at ftp://ftp.gnu.org/pub/gnu/fileutils. Directions for installing the patches and building the userland utilities can be found at http://acl.bestbits.at/.
The ACL implementation is still very much a work in progress and is not for the faint of heart; there is a decent chance you will run into compilation problems. However, ACL's can go a long way toward reducing the need to give people root access. Instead of relying on an application like sudo to allow limited root access, ACL's can be used to allow specific users to run a setuid application without creating large numbers of groups.
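Once the patches and utilities are in place, granting one extra user read access to a file is a one-liner. A sketch of the interface (the file name is arbitrary; "nobody" is just a user that exists on most systems):

```shell
# Give the user "nobody" read access to report.txt without touching
# the file's group membership or world bits.
touch report.txt
setfacl -m u:nobody:r report.txt    # -m = modify (add) an ACL entry
getfacl report.txt                  # listing now shows a "user:nobody:r--" line
rm report.txt
```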
4. Conclusion
A number of the projects being developed in the Linux kernel will go a long way toward improving security on Linux machines. By introducing methods that reduce the dependence upon root for specific privileged actions, the exposure that comes from having large numbers of programs running as root is greatly reduced. If applications can only perform the actions they actually require, the likelihood of a vulnerability allowing a user to perform arbitrary actions as root becomes much smaller. Finer-grained access control mechanisms, like those provided by ACL's, reduce this exposure even further.
Relevant Links
Capabilities FAQ (kernel.org)
lcap (spoon@ix.netcom.com)
Focus On Sun: ACL's (Jeremy Rauch)
Linux ACL's (Andreas Grunbacher)
Copyright 2006, SecurityFocus
Wednesday, May 10, 2006
Editing Kernel Parameters using sysctl
To get a quick overview of all settings configurable in the /proc/sys/ directory, type the sysctl -a command as root. This will create a large, comprehensive list, a small portion of which looks something like this:
net.ipv4.route.min_delay = 2
This is the same basic information you would see if you viewed each of the files individually. The only difference is the notation: the /proc/sys/net/ipv4/route/min_delay file is written as net.ipv4.route.min_delay, with the directory slashes replaced by dots and the /proc/sys/ portion assumed.
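The mapping is easy to verify by reading the same setting both ways (using net.ipv4.ip_forward here as the example key, since it exists on virtually every kernel):

```shell
# Both commands read the same kernel setting; the sysctl name is just
# the path under /proc/sys/ with slashes replaced by dots.
cat /proc/sys/net/ipv4/ip_forward
sysctl -n net.ipv4.ip_forward      # -n prints the value without the key name
```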
The sysctl command can be used in place of echo to assign values to writable files in the /proc/sys/ directory. For instance, instead of using this command:
echo 1 > /proc/sys/kernel/sysrq
you can use this sysctl command:
sysctl -w kernel.sysrq="1"
While quickly setting single values like this in /proc/sys/ is helpful during testing, it does not work as well on a production system, as all /proc/sys/ special settings are lost when the machine is rebooted. To make the settings you want permanent, add them to the /etc/sysctl.conf file.
Every time the system boots, the /etc/rc.d/rc.sysinit script is executed by init. This script contains a command to execute sysctl using /etc/sysctl.conf as the values to set. Therefore, any values added to /etc/sysctl.conf will take effect after the system boots.
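A sketch of what that looks like (the two entries are example settings, not recommendations; sysctl -p re-applies the file by hand, so you don't have to wait for a reboot):

```shell
# Example /etc/sysctl.conf entries -- one "key = value" pair per line,
# applied automatically at boot by /etc/rc.d/rc.sysinit:
#
#   kernel.sysrq = 1
#   net.ipv4.ip_forward = 0
#
# Re-apply the whole file immediately, without rebooting (as root):
sysctl -p /etc/sysctl.conf
```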
# sysctl
usage: sysctl [-n] [-e] variable ...
sysctl [-n] [-e] -a
sysctl [-n] [-e] -p
sysctl [-n] [-e] -A
Monday, April 10, 2006
Host based Authentication Using SSH
BACKGROUND
Installation is a two-step process, taking place on the two machines:
- Machine "A" is the machine you want to connect from.
- Machine "B" is the machine you want to connect to.
On Machine "A" (the machine you connect from), generate your key pair using ssh-keygen:
ssh-keygen -t dsa
This will create two files:
~/.ssh/id_dsa
~/.ssh/id_dsa.pub
- Secure copy the id_dsa.pub key across to Machine "B" (the machine you connect to)
- Now look at the file ~/.ssh/id_dsa.pub; if it starts with "ssh-dss", e.g.:
ssh-dss AAAAB3NzaC1kc3MAAAC+CLO2M9OfcIjEaFBJ+cNAubJeCw8dtlHn1aKKN
3i9p4YA4w+cXVvOoD6RVD2TLudLu5av8WLiePZemUws7F4Z6hj4XHVA09Oxzneetf
9c4XoMiSLrkEaTzwFkQmefU3Jo4dQtK94rLqezd7ljs6/A91RpWSIQ0e4gYpl6fql
sUx51AAAAFQDpN1MHahy7NuCTG7g6PmsZcMN47QAAAIBhV7zbd4tPi0IqJSk3d8K4
VHb6udU+ofyTOM92E/vCO2fk392dqrxvo65ly5kYKlaMKFSYZ3GdFyAUJlf47hdra
KgoxSR6xBqin9a8vq9q5EW+hMSXAJlD1/zeXydnmuxpVTTK/Lu9yTcEKuKsiHR9Ml
XBmEqc5Cr/OQV83tehxQAAAIBIJp6sNFd4eFUxSQmfuMS56Cw5rbui8hDBNb5ViwS
LGZFxuHquCyaqr81Y4dNecNUrlU+m6cXLvMY5SlspnBTuDCKGOIQmSsoiNnjOhYO4
iWLKPN6hTYlmee+fqG2BJ24zE8sLB5t1KiqGKm4VUvaNGSDtDHMLeCz+qqH6H7LPI
A== fred@somemachine.somewhere.com
- Then append it to the files ~/.ssh/authorized_keys2 and ~/.ssh2/authorized_keys2 on Machine "B":
cat id_dsa.pub >> ~/.ssh/authorized_keys2
cat id_dsa.pub >> ~/.ssh2/authorized_keys2
- If it starts with "1024", then append it to the file ~/.ssh/authorized_keys instead
- Change the access permissions to user read/write only:
chmod 600 ~/.ssh/authorized_keys*
chmod 600 ~/.ssh2/authorized_keys2
- Test: you should now be able to ssh from the "from" machine to the "to" machine without being prompted for a password.
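The key-generation step can be rehearsed locally without touching your real ~/.ssh (an RSA key is used here because some newer OpenSSH builds no longer generate DSA keys; the procedure is otherwise identical):

```shell
# Generate a passphrase-less key pair into a scratch directory and
# confirm both halves exist.
tmp=$(mktemp -d)
ssh-keygen -t rsa -N "" -f "$tmp/id_rsa" > /dev/null
ls -l "$tmp"/id_rsa "$tmp"/id_rsa.pub   # private and public halves
rm -rf "$tmp"
```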
What to look at if it doesn't work
- Check that the files ~/.ssh[2]/authorized_keys[2] exist and have permissions of user read/write only:
machine:/home/fred/.ssh2% ls -al ~/.ssh2
total 12
drwxr-x--x 2 fred users 4096 Sep 3 09:44 .
drwxr-xr-x 18 fred users 4096 Sep 3 09:51 ..
-rw------- 1 fred users 617 Sep 3 09:44 authorized_keys2
If they do not have 600 permissions, ssh may refuse to authenticate using the files (this depends on how ssh was built).
- Try running ssh in verbose mode to detect errors:
ssh -v -l user hostname
or, to generate more detail:
ssh -vv -l user hostname
This should provide you with details on where the authentication is failing. A correctly implemented session generates the following output:
MachineA:/home/fred% ssh -v MachineB
OpenSSH_3.4p1, SSH protocols 1.5/2.0, OpenSSL 0x0090602f
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Rhosts Authentication disabled, originating port will not be trusted.
debug1: ssh_connect: needpriv 0
debug1: Connecting to MachineB [192.168.72.3] port 22.
debug1: Connection established.
debug1: identity file /home/fred/.ssh/identity type -1
debug1: identity file /home/fred/.ssh/id_rsa type -1
debug1: identity file /home/fred/.ssh/id_dsa type 2
debug1: Remote protocol version 1.99, remote software version OpenSSH_3.1p1
debug1: match: OpenSSH_3.1p1 pat OpenSSH_2.*,OpenSSH_3.0*,OpenSSH_3.1*
Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_3.4p1
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-cbc hmac-md5 none
debug1: kex: client->server aes128-cbc hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: dh_gen_key: priv key bits set: 145/256
debug1: bits set: 1587/3191
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Host 'MachineB' is known and matches the RSA host key.
debug1: Found key in /home/fred/.ssh/known_hosts:11
debug1: bits set: 1522/3191
debug1: ssh_rsa_verify: signature correct
debug1: kex_derive_keys
debug1: newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: waiting for SSH2_MSG_NEWKEYS
debug1: newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: done: ssh_kex2.
debug1: send SSH2_MSG_SERVICE_REQUEST
debug1: service_accept: ssh-userauth
debug1: got SSH2_MSG_SERVICE_ACCEPT
debug1: authentications that can continue: publickey,password
debug1: next auth method to try is publickey
debug1: try privkey: /home/fred/.ssh/identity
debug1: try privkey: /home/fred/.ssh/id_rsa
debug1: try pubkey: /home/fred/.ssh/id_dsa
debug1: input_userauth_pk_ok: pkalg ssh-dss blen 433 lastkey 0x808bbc0 hint 2
debug1: read PEM private key done: type DSA
debug1: ssh-userauth2 successful: method publickey
debug1: channel 0: new [client-session]
debug1: send channel open 0
debug1: Entering interactive session.
debug1: ssh_session2_setup: id 0
debug1: channel request 0: pty-req
debug1: channel request 0: shell
debug1: fd 3 setting TCP_NODELAY
debug1: channel 0: open confirm rwindow 0 rmax 32768
Last login: Tue Sep 3 12:36:15 2002 from MachineA
MachineB:/home/fred%
A typical known_hosts file holds one host key per line, mixing older SSH1-style entries with newer ssh-rsa ones:
serverA 1024 35 140628275716542335018227053630138762374955
serverB 1024 37 168018214940842216534129401775597301238680
machine,192.168.32.1 1024 35 10756111653456055548443403148
websrv 1024 35 1075651116055548443031653434843152227399753
home 1024 37 109320343534525974618179046875249123021506535
10.3.2.1 1024 37 10920343597497631261817904687524902764535
10.3.2.2 1024 35 15939346411202234584533241802340556110376
serverC,192.168.10.2 ssh-rsa AAAAB3Nznty2btrsaC1yc2EAAABIw
machine_c ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAIEApqLnhgfwdWZgS
machine_b,10.3.2.27 ssh-rsa AAAAB3NzaC1mnhjtfc2EAAAABIwAAA
You will be prompted to accept the fingerprint and continue connecting; after that initial connection, the host authentication should resume working:
serverA:/home/fred% ssh serverB
The authenticity of host 'serverB (192.168.44.2)' can't be established.
RSA key fingerprint is e3:c3:89:37:4b:94:37:d7:0c:d5:6f:9a:38:62:ce:1b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'serverB' (RSA) to the list of known hosts.
Last login: Tue Sep 3 12:40:34 2002 from serverA
Tuesday, March 21, 2006
Useful commands
Remove trace files that are more than a day old:
find . -name "*.trc" -mtime +1 -exec rm {} \;
You can change the +1 to, for example, +3 if you need to remove files that are more than 3 days old.
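The cleanup can be rehearsed safely on a scratch directory before pointing it at real trace files (touch -d is a GNU extension, used here only to backdate a file):

```shell
# Create a backdated .trc file, then delete anything older than a day.
tmp=$(mktemp -d)
touch -d '3 days ago' "$tmp/old.trc"
find "$tmp" -name "*.trc" -mtime +1 -exec rm {} \;
ls "$tmp"                          # old.trc is gone
rm -rf "$tmp"
```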
List files sorted by modification time, most recently modified last:
ls -ltr
Get a summary of the disk space used under a given directory:
$cd foo/
$du -h -s
Need to make a copy of a directory?
Use the command
(cd /old/directory; tar cf - .) | (cd /new/directory; tar xf -)
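A quick rehearsal of the tar pipe on scratch directories (file names are arbitrary); the first tar packs the old tree to stdout and the second unpacks it from stdin, preserving permissions and symlinks along the way:

```shell
src=$(mktemp -d); dst=$(mktemp -d)
echo hello > "$src/file.txt"
# Pack the old tree to stdout, unpack from stdin in the new tree.
(cd "$src"; tar cf - .) | (cd "$dst"; tar xf -)
cat "$dst/file.txt"
rm -rf "$src" "$dst"
```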
List all Open files
Use the command
$lsof
If you need to check all the files open by a user, e.g. john:
$lsof -u john
Look at a log file in reverse:
tac filename | more
"tac" is the command that allows you to do that.
$arch
This command lists the architecture of your machine.
# kernelversion
2.6
# dnsdomainname
example.com
Wednesday, March 08, 2006
Article on Journalling Filesystems - Linux Magazine
http://www.linux-mag.com/index.php?option=com_content&task=view&id=1167&Itemid=2045
Journaling File Systems
Feature Story
Written by Steve Best
Tuesday, 15 October 2002
The file system is one of the most important parts of an operating system. The file system stores and manages user data on disk drives, and ensures that what's read from storage is identical to what was originally written. In addition to storing user data in files, the file system also creates and manages information about files and about itself. Besides guaranteeing the integrity of all that data, file systems are also expected to be extremely reliable and have very good performance.

For the past several years, Ext2 has been the de facto file system for most Linux machines. It's robust, reliable, and suitable for most deployments. However, as Linux displaces Unix and other operating systems in more and more large server and computing environments, Ext2 is being pushed to its limits. In fact, many now common requirements -- large hard-disk partitions, quick recovery from crashes, high-performance I/O, and the need to store thousands and thousands of files representing terabytes of data -- exceed the abilities of Ext2.

Fortunately, a number of other Linux file systems take up where Ext2 leaves off. Indeed, Linux now offers four alternatives to Ext2: Ext3, ReiserFS, XFS, and JFS. In addition to meeting some or all of the requirements listed above, each of these alternative file systems also supports journaling, a feature certainly demanded by enterprises, but beneficial to anyone running Linux. A journaling file system can simplify restarts, reduce fragmentation, and accelerate I/O. Better yet, journaling file systems make fscks a thing of the past.

If you maintain a system of fair complexity or require high-availability, you should seriously consider a journaling file system. Let's find out how journaling file systems work, look at the four journaling file systems available for Linux, and walk through the steps of installing one of the newer systems, JFS.
Switching to a journaling file system is easier than you might think, and once you switch -- well, you'll be glad you did.

Fun with File Systems

To better appreciate the benefits of journaling file systems, let's start by looking at how files are saved in a non-journaled file system like Ext2. To do that, it's helpful to speak the vernacular of file systems.
Figure Two illustrates blocks, inodes (with a number of meta-data attributes), directories, and their relationships.
When Good File Systems Go Bad

With those concepts in mind, here's what happens when a three-block file is modified and grows to be a five-block file:
As you can see, while writing data to a file appears to be a single atomic operation, the actual process involves a number of steps (even more steps than shown here if you consider all of the accounting required to remove free blocks from a list of free blocks, among other possible metadata changes). If all the steps to write a file are completed perfectly (and this happens most of the time), the file is saved successfully. However, if the process is interrupted at any time (perhaps due to power failure or other systemic failure), a non-journaled file system can end up in an inconsistent state. Corruption occurs because the logical operation of writing (or updating) a file is actually a sequence of I/O, and the entire operation may not be totally reflected on the media at any given point in time. If the meta-data or the file data is left in an inconsistent state, the file system will no longer function properly.

Non-journaled file systems rely on fsck to examine all of the file system's metadata and detect and repair structural integrity problems before restarting. If Linux shuts down smoothly, fsck will typically return a clean bill of health. However, after a power failure or crash, fsck is likely to find some kind of error in meta-data. A file system has a lot of meta-data, and fsck can be very time consuming. After all, fsck has to scan a file system's entire repository of meta-data to ensure consistency and error-free operation. As you may have experienced, the speed of fsck on a disk partition is proportional to the size of the partition, the number of directories, and the number of files in each directory.

For large file systems, journaling becomes crucial. A journaling file system provides improved structural consistency, better recovery, and faster restart times than non-journaled file systems. In most cases, a journaled file system can restart in less than a second.

Dear Journal...

The magic of journaling file systems lies in transactions.
Just like a database transaction, a journaling file system transaction treats a sequence of changes as a single, atomic operation -- but instead of tracking updates to tables, the journaling file system tracks changes to file system meta-data and/or user data. The transaction guarantees that either all or none of the file system updates are done. For example, the process of creating a new file modifies several meta-data structures (inodes, free lists, directory entries, etc.). Before the file system makes those changes, it creates a transaction that describes what it's about to do. Once the transaction has been recorded (on disk), the file system goes ahead and modifies the meta-data. The journal in a journaling file system is simply a list of transactions.

In the event of a system failure, the file system is restored to a consistent state by replaying the journal. Rather than examine all meta-data (the fsck way), the file system inspects only those portions of the meta-data that have recently changed. Recovery is much faster, usually only a matter of seconds. Better yet, recovery time is not dependent on the size of the partition.

In addition to faster restart times, most journaling file systems also address another significant problem: scalability. If you combine even a few large-capacity disks, you can assemble some massive (certainly by early-90s' standards) file systems. Features of modern file systems include:
More advanced file systems also manage sparse files, internal fragmentation, and the allocation of inodes better than Ext2.

A Wealth of Options

While advanced file systems are tailored primarily for the high throughput and high uptime requirements of servers (from single processor systems to clusters), these file systems can also benefit client machines where performance and reliability are wanted or needed. As mentioned in the introduction, recent releases of Linux include not one, but four journaling file systems. JFS from IBM, XFS from SGI, and ReiserFS from Namesys have all been "open sourced" and subsequently included in the Linux kernel. In addition, Ext3 was developed as a journaling add-on to Ext2. Figure Three shows where the file systems fit in Linux. You'll note that JFS, XFS, ReiserFS, and Ext3 are independent "peers." It's possible for a single Linux machine to use all of those file systems at the same time. A system administrator could configure a system to use XFS on one partition, and ReiserFS on another.
What are the features and benefits of each system? Let's take a quick look at Ext3, ReiserFS, and XFS, and then an in-depth look at JFS.

EXT3

As mentioned above, Ext2 is the de facto file system for Linux. While it lacks some of the advanced features (extremely large files, extent-mapped files, etc.) of XFS and ReiserFS and others, it's reliable, stable, and still the default "out of the box" file system for all Linux distributions. Ext2's real weakness is fsck: the bigger the Ext2 file system, the longer it takes to fsck. Longer fsck times mean longer down times.

The Ext3 file system was designed to provide higher availability without impacting the robustness (at least the simplicity and reliability) of Ext2. Ext3 is a minimal extension to Ext2 to add support for journaling. Ext3 uses the same disk layout and data structures as Ext2, and it's forward- and backward-compatible with Ext2. Migration from Ext2 to Ext3 (and vice versa) is quite easy, and can even be done in-place in the same partition. The other three journaling file systems require the partition to be formatted with their mkfs utility. If you want to adopt a journaling file system, but don't have free partitions on your system, Ext3 could be the journaling file system to use. See "Switching to Ext3" for information on how to switch to Ext3 on your Linux machine.
The downside of Ext3? It's an add-on to Ext2, so it still has the same limitations that Ext2 has. The fixed internal structures of Ext2 are simply too small (too few bits) to capture large file sizes, extremely large partition sizes, and enormous numbers of files in a single directory. Moreover, the bookkeeping techniques of Ext2, such as its linked-list directory implementation, do not scale well to large file systems (there is an upper limit of 32,768 subdirectories in a single directory, and a "soft" upper limit of 10,000-15,000 files in a single directory).

To make radical improvements to Ext2, you'd have to make radical changes. Radical change was not the intent of Ext3. However, newer file systems do not have to be backward-compatible with Ext2. ReiserFS, XFS, and JFS offer scalability, high-performance, very large file systems, and of course, journaling. "Why Four Journaling File Systems is a Good Thing" presents an overview of the capabilities of the four journaling file systems.
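The in-place Ext2-to-Ext3 migration can be sketched on a throwaway loopback image rather than a real partition (the /tmp/ext2.img path is made up for the demonstration; on a real system you would run tune2fs -j against the partition device and change its /etc/fstab entry from ext2 to ext3):

```shell
# Build a small ext2 image, then add an ext3 journal to it in place.
dd if=/dev/zero of=/tmp/ext2.img bs=1024 count=8192 2>/dev/null
mke2fs -q -F /tmp/ext2.img              # -F: operate on a plain file
tune2fs -j /tmp/ext2.img                # -j: add a journal (the migration itself)
tune2fs -l /tmp/ext2.img | grep has_journal   # the feature flag is now present
rm /tmp/ext2.img
```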
REISERFS

ReiserFS is designed and developed by Hans Reiser and his team of developers at Namesys. Like the other journaling file systems, it's open source, is available in most Linux distributions, and supports meta-data journaling. One of the unique advantages of ReiserFS is support for small files -- lots and lots of small files. Reiser's philosophy is simple: small files encourage coding simplicity. Rather than use a database or create your own file caching scheme, use the filesystem to handle lots of small pieces of information. ReiserFS is about eight to fifteen times faster than Ext2 at handling files smaller than 1K. Even more impressive, (when properly configured) ReiserFS can actually store about 6% more data than Ext2 on the same physical file system. Rather than allocate space in fixed 4K blocks, ReiserFS can allocate the exact space that's needed. A B* tree manages all file system meta-data, and stores and compresses tails, portions of files smaller than a block. Of course, ReiserFS also has excellent performance for large files, but it's especially adept at managing small files. For a more in-depth discussion of ReiserFS and instructions on how to install it, see "Journaling File Systems" in the August 2000 issue, available online at http://www.linux-mag.com/2000-08/journaling_01.html.

JFS

JFS for Linux is based on IBM's successful JFS file system for OS/2 Warp. Donated to open source in early 2000 and ported to Linux soon after, JFS is well-suited to enterprise environments. JFS uses many advanced techniques to boost performance, provide for very large file systems, and of course, journal changes to the file system. SGI's XFS (described next) has many similar features. Some of the features of JFS include:
There are other advanced features in JFS such as allocation groups (which speed file access times by maximizing locality), and various block sizes ranging from 512 bytes to 4096 bytes (which can be tuned to avoid internal and external fragmentation). You can read about all of them at the JFS Web site at http://www-124.ibm.com/developerworks/oss/jfs.

XFS

A little more than a year ago, SGI released a version of its high-end XFS file system for Linux. Based on SGI's Irix XFS file system technology, XFS supports meta-data journaling and extremely large disk farms. How large? A single XFS file system can be 18,000 petabytes (a petabyte is 10^15 bytes) and a single file can be 9,000 petabytes. XFS is also capable of delivering excellent I/O performance. In addition to truly amazing scale and speed, XFS uses many of the same techniques found in JFS.

Installing JFS

For the rest of the article, let's look at how to install and use IBM's JFS system. If you have the latest release of Turbolinux, Mandrake, SuSE, Red Hat, or Slackware, you can probably skip ahead to the section "Creating a JFS Partition." If you want to include the latest JFS source code drop in your kernel, the next few sections show you what to do.

THE LATEST AND GREATEST

JFS has been incorporated into the 2.5.6 Linux kernel, and is also included in Alan Cox's 2.4.x-ac kernels beginning with 2.4.18-pre9-ac4, which was released on February 14, 2002. Alan's patches for the 2.4.x series are available from http://www.kernel.org. You can also download a 2.4 kernel source tree and add the JFS patches to this tree. JFS comes as a patch for several of the 2.4.x kernels, so first of all, get the latest kernel from http://www.kernel.org. At the time of writing, the latest kernel was 2.4.18 and the latest release of JFS was 1.0.20. We'll be using those in the instructions below. The JFS patch is available from the JFS web site.
You need the utilities (jfsutils-1.0.20.tar.gz), the kernel patch (jfs-2.4.18-patch), and the file system source (jfs-2.4-1.0.20.tar.gz). If you're using any of the latest distros, you probably won't have to patch the kernel for the JFS code. Instead, you'll only need to compile the kernel to update to the latest release of JFS (you can build JFS either as built-in or as a module). (To determine what version of JFS was shipped in the distribution you're running, you can edit the JFS file super.c and look for a printk() that has the JFS development version number string.)

PATCHING THE KERNEL TO SUPPORT JFS

In the example below, we'll use the 2.4.18 kernel source tree as an example of how to patch JFS into the kernel source tree. First, you need to download the Linux kernel: linux-2.4.18.tar.gz. If you have a linux subdirectory, move it to linux-org, so it won't be replaced by the linux-2.4.18 source tree:
% mv linux linux-org
When you download the kernel archive, save it under /usr/src and expand the kernel source tree by using:
% tar zxvf linux-2.4.18.tar.gz
This operation will create a directory named /usr/src/linux. The next step is to get the JFS utilities and the appropriate patch for kernel 2.4.18. Before you do that, you need to create a directory for JFS source, /usr/src/jfs1020, and download (to that directory) the JFS kernel patch and the JFS file system source files. Once you have those files, you have everything you need to patch the kernel. Next, change to the directory of the kernel 2.4.18 source tree and apply the JFS kernel patch:
% cd /usr/src/linux
% patch -p1 < /usr/src/jfs1020/jfs-2.4.18-patch
Now, you need to configure the kernel and enable JFS by going to the File systems section of the configuration menu and enabling JFS file system support (CONFIG_JFS_FS=y). You also have the option to configure JFS as a module, in which case you only need to recompile and reinstall kernel modules by typing:
% make modules && make modules_install
Otherwise, if you configured the JFS option as a kernel built-in, you need to:
1. Recompile the kernel (in /usr/src/linux):
% make dep && make clean && make bzImage
2. Recompile and install modules (only if you added other options as modules):
% make modules && make modules_install
3. Install the kernel:
# cp arch/i386/boot/bzImage /boot/jfs-bzImage
Next, update /etc/lilo.conf with the new kernel. Add an entry like the one that follows, and a jfs1020 entry should appear at the lilo boot prompt:
image=/boot/jfs-bzImage
	label=jfs1020
Be sure to specify the correct root partition. Then run # lilo to make the system aware of the new kernel. Reboot and select the jfs1020 kernel to boot from the new image.

After you compile and install the kernel, you should compile and install the JFS utilities. Save the jfsutils-1.0.20.tar.gz file into the /usr/src/jfs1020 directory, expand it, run configure, and then install the utilities:
% tar zxvf jfsutils-1.0.20.tar.gz

Creating a JFS Partition

Having built and installed the JFS utilities, the next step is to create a JFS partition. In this example, we'll demonstrate the process using a spare partition. (If there's unpartitioned space on your disk, you can create a partition using fdisk. After you create the partition, reboot the system to make sure that the new partition is available to create a JFS file system on it. In our test system, we had /dev/hdb3 as a spare partition.) To create the JFS file system with the log inside the JFS partition, use the following command:
# mkfs.jfs /dev/hdb3
After the file system has been created, you need to mount it. You will need a mount point; create a new empty directory such as /jfs and mount the file system with the following command:
# mount -t jfs /dev/hdb3 /jfs
After the file system is mounted, you are ready to try out JFS. To unmount the JFS file system, you simply use the umount command with the same mount point as the argument:
# umount /jfs
Go Faster with An External Log
An external log improves performance since the log updates are saved to a different partition than its corresponding file system. To create a JFS file system with the log on an external device, your system will need two unused partitions. Our test system had /dev/hda6 and /dev/hdb1 as spare partitions.
# mkfs.jfs -j /dev/hdb1 /dev/hda6
To mount the file system, use the following mount command:
# mount -t jfs /dev/hda6 /jfs
So you don't have to mount this file system every time you boot, you can add it to /etc/fstab. Make a backup of /etc/fstab and edit it with your favorite editor, adding the /dev/hda6 device. For example, add:
/dev/hda6 /jfs jfs defaults 1 2
(The fields are: device, mount point, file system type, mount options, dump flag, and fsck pass number.)

Not Just for Reboots Anymore
Some people have the impression that journaling file systems only provide fast restart times. As you've seen, this isn't true. Considerable coding efforts have made journaling file systems scalable, reliable, and fast. Whether you're running an enterprise server, a cluster supercomputer, or a small Web site, XFS, JFS, and ReiserFS add credibility and oomph to Linux. Need a better reason to switch to a journaling file system? Just imagine yourself in a world without fsck. What will you do with all that extra time?
Steve Best works in the Linux Technology Center of IBM in Austin, Texas. He is currently working on the Journaled File System (JFS) for Linux project. Steve has done extensive work in operating system development with a focus in the areas of file systems, internationalization, and security. He can be reached at sbest@us.ibm.
Saturday, March 04, 2006
Fixing corrupted/missing Grub on MBR
How to get it back?
It's pretty simple.
Get hold of the first disk (CD-ROM) of your Linux installation CDs; I'll consider Red Hat EL 4.0 here.
Boot your system from the CD and type linux rescue at the boot prompt.
You will be asked a number of questions (locale, whether to check the CDs, and so on).
Your current Linux installation will then be found and mounted under /mnt/sysimage, and you will get a shell prompt. Just type chroot /mnt/sysimage to make this mount your root file system.
Then just type:
# /sbin/grub-install /dev/hda
(assuming you have a primary IDE hard disk)
Reboot your PC, and you should see the GRUB boot screen.
Sunday, February 12, 2006
Using @reboot @monthly @weekly.... in CRON
String      What it means
@reboot     Run once, at startup.
@yearly     Run once a year, "0 0 1 1 *".
@annually   (same as @yearly)
@monthly    Run once a month, "0 0 1 * *".
@weekly     Run once a week, "0 0 * * 0".
@daily      Run once a day, "0 0 * * *".
@midnight   (same as @daily)
@hourly     Run once an hour, "0 * * * *".
@reboot can be used to start a program at boot time.
Another possible use could be to trigger an email notification/alert whenever the given machine is rebooted.
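As a sketch of that idea, a crontab could run a small helper script at boot. The script path and recipient address below are hypothetical, and the actual mail command is left commented out:

```shell
#!/bin/sh
# notify-reboot.sh -- hypothetical helper for a crontab entry such as:
#   @reboot /usr/local/bin/notify-reboot.sh
# Build a short message recording that this host has just come up.
host=$(hostname)
msg="Host $host rebooted at $(date)"
echo "$msg"
# To actually send the alert, pipe the message to mail(1), e.g.:
#   echo "$msg" | mail -s "Reboot alert: $host" admin@example.com
```

After installing the entry, crontab -l should show the @reboot line along with any other entries.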
Thursday, January 05, 2006
Booting Redhat Linux in Rescue Mode
Rescue mode provides the ability to boot a small Red Hat Linux environment entirely from a diskette, CD-ROM, or some other boot method instead of the system's hard drive.
As the name implies, rescue mode is provided to rescue you from something. During normal operation, your Red Hat Linux system uses files located on your system's hard drive to do everything — run programs, store your files, and more.
However, there may be times when you are unable to get Red Hat Linux running completely enough to access files on your system's hard drive. Using rescue mode, you can access the files stored on your system's hard drive, even if you cannot actually run Red Hat Linux from that hard drive.
To boot into rescue mode, you must be able to boot the system using one of the following methods:
* By booting the system from an installation boot diskette made from the bootdisk.img image. [1]
* By booting the system from an installation boot CD-ROM. [2]
* By booting the system from the Red Hat Linux CD-ROM #1.
Once you have booted using one of the described methods, enter the following command at the installation boot prompt:
linux rescue
You are prompted to answer a few basic questions, including which language to use. It also prompts you to select where a valid rescue image is located. Select from Local CD-ROM, Hard Drive, NFS image, FTP, or HTTP. The location selected must contain a valid installation tree, and the installation tree must be for the same version of Red Hat Linux as the Red Hat Linux CD-ROM #1 from which you booted. If you used a boot CD-ROM or diskette to start rescue mode, the installation tree must be from the same tree from which the media was created. For more information about how to set up an installation tree on a hard drive, NFS server, FTP server, or HTTP server, refer to the Red Hat Linux Installation Guide.
If you select a rescue image that does not require a network connection, you are asked whether or not you want to establish one. A network connection is useful if you need to back up files to a different computer or install some RPM packages from a shared network location, for example.
You will also see the following message:
The rescue environment will now attempt to find your Red Hat
Linux installation and mount it under the directory
/mnt/sysimage. You can then make any changes required to your
system. If you want to proceed with this step choose
'Continue'. You can also choose to mount your file systems
read-only instead of read-write by choosing 'Read-only'.
If for some reason this process fails you can choose 'Skip'
and this step will be skipped and you will go directly to a
command shell.
If you select Continue, it will attempt to mount your file system under the directory /mnt/sysimage. If it fails to mount a partition, it will notify you. If you select Read-Only, it will attempt to mount your file system under the directory /mnt/sysimage, but in read-only mode. If you select Skip, your file system will not be mounted. Choose Skip if you think your file system is corrupted.
Once you have your system in rescue mode, a prompt appears on VC (virtual console) 1 and VC 2 (use the [Ctrl]-[Alt]-[F1] key combination to access VC 1 and [Ctrl]-[Alt]-[F2] to access VC 2):
-/bin/sh-2.05b#
If you selected Continue to mount your partitions automatically and they were mounted successfully, you are in single-user mode.
Even if your file system is mounted, the default root partition while in rescue mode is a temporary root partition, not the root partition of the file system used during normal user mode (runlevel 3 or 5). If you selected to mount your file system and it mounted successfully, you can change the root partition of the rescue mode environment to the root partition of your file system by executing the following command:
chroot /mnt/sysimage
This is useful if you need to run commands such as rpm that require your root partition to be mounted as /. To exit the chroot environment, type exit, and you will return to the prompt.
If you selected Skip, you can still try to mount a partition manually inside rescue mode by creating a directory such as /foo, and typing the following command:
mount -t ext3 /dev/hda5 /foo
In the above command, /foo is a directory that you have created and /dev/hda5 is the partition you want to mount. If the partition is of type ext2, replace ext3 with ext2.
If you do not know the names of your partitions, use the following command to list them:
fdisk -l
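As an aside, the kernel's own view of the partitions is also exposed under /proc (a standard Linux interface), which can serve as a quick cross-check, assuming /proc is mounted in the rescue environment:

```shell
# Show the partitions the kernel currently knows about.
# Output columns are: major minor #blocks name (e.g. hda5).
cat /proc/partitions
```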
From the prompt, you can run many useful commands, such as:
* list-harddrives to list the hard drives in the system
* ssh, scp, and ping if the network is started
* dump and restore for users with tape drives
* parted and fdisk for managing partitions
* rpm for installing or upgrading software
* joe for editing configuration files (If you try to start other popular editors such as emacs, pico, or vi, the joe editor will be started.)