The find command allows users to perform a comprehensive search spanning the directory tree. find also allows the setting of more specific options to filter the search results, and when you’ve found what you’re looking for, find even has the option to do some work on those files.

Finding Files by Age

What if a user wants to determine if there are any really old files on their server? There are dozens of options for the find command, but the first thing find requires is the path in which to look.

In this example we will change our working directory to the / (root) directory and run the find command on the working directory by giving ./ as the path argument. The following command sequence looks for any files that are more than 20 years (7,300 days) old.

Finding Files Older than 20 Years

# cd /
# find ./ -mtime +7300
./tmp/orbit-root
# cd /tmp
# ls -ld orbit-root
drwx------ 2 root root 8192 Dec 31 1969 orbit-root

By default find prints the name and path of any files which match the criteria listed. In this case it has found ./tmp/orbit-root, which has not been modified in more than 7,300 days.

You’ve probably noticed that the date on this file is a bit suspect. While the details are unimportant, it is worth understanding that anything on a Linux system with a date of December 31, 1969 or January 1, 1970 has probably lost its date and time attributes somehow. Those dates correspond to a timestamp of zero, the very beginning of the Unix epoch (midnight UTC on January 1, 1970, which falls on December 31, 1969 in US time zones). The file may also have been created at some time when the system’s clock was horribly wrong.
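If you’re curious where these dates come from, you can ask the date command to print timestamp zero. Note that the -d @ syntax used here is specific to GNU date, so it may not be available on every UNIX:

$ date -u -d @0
Thu Jan  1 00:00:00 UTC 1970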

If we wanted to search the root directory without changing our working directory we could have specified the directory in the find command like this:

# find / -mtime +7300
/tmp/orbit-root

The command found the same file in this case but has now described it starting with / instead of ./ because that is what was used in the find command.

The following command sequence will look for some newer files. The process starts in the user’s home directory and looks for files modified within the past three days.

Finding Any Files Modified in the Past 3 Days

$ cd ~
$ find . -mtime -3
./.bash_history
./examples
./examples/preamble.txt
./examples/other.txt
./example1.fil
./.viminfo

Now we start to really see the power of the find command. It has identified files not only in the working directory but in a subdirectory as well! Let’s verify the findings with some ls commands:

$ ls -alt
total 56
drwxrwxr-x 2 tclark authors 4096 Feb 3 17:45 examples
-rw------- 1 tclark tclark 8793 Feb 3 14:04 .bash_history
drwx------ 4 tclark tclark 4096 Feb 3 11:17 .
-rw------- 1 tclark tclark 1066 Feb 3 11:17 .viminfo
-rw-rw-r-- 1 tclark tclark 0 Feb 3 09:00 example1.fil
-rw-r--r-- 1 tclark authors 0 Jan 27 00:22 umask_example.fil
drwxr-xr-x 8 root root 4096 Jan 25 22:16 ..
-rw-rw-r-- 1 tclark tclark 0 Jan 13 21:13 example2.xxx
-rw-r--r-- 1 tclark tclark 120 Aug 24 06:44 .gtkrc
-rw-r--r-- 1 tclark tclark 24 Aug 18 11:23 .bash_logout
-rw-r--r-- 1 tclark tclark 191 Aug 18 11:23 .bash_profile
-rw-r--r-- 1 tclark tclark 124 Aug 18 11:23 .bashrc
-rw-r--r-- 1 tclark tclark 237 May 22 2003 .emacs
-rw-r--r-- 1 tclark tclark 220 Nov 27 2002 .zshrc
drwxr-xr-x 3 tclark tclark 4096 Aug 12 2002 .kde
$ cd examples
$ ls -alt
total 20
drwxrwxr-x 2 tclark authors 4096 Feb 3 17:45 .
-rw-rw-r-- 1 tclark tclark 0 Feb 3 17:45 other.txt
-rw-rw-r-- 1 tclark authors 360 Feb 3 17:44 preamble.txt
drwx------ 4 tclark tclark 4096 Feb 3 11:17 ..
-rw-r--r-- 1 tclark authors 2229 Jan 13 21:35 declaration.txt
-rw-rw-r-- 1 tclark presidents 1310 Jan 13 17:48 gettysburg.txt

So we see that find has turned up what we were looking for. Now we will refine our search even further.

Finding .txt Files Modified in the Past 3 Days

Sometimes we are only concerned with specific files in the directory. For example, say you wrote a text file sometime in the past couple of days and now you can’t remember what you called it or where you put it. Here’s one way you could find that text file without having to go through your entire system:

$ find . -name '*.txt' -mtime -3
./preamble.txt
./other.txt

Now you’ve got even fewer files than in the last search and you could easily identify the one you’re looking for.

Finding Files by Size

If a user is running short of disk space, they may want to find some large files and compress them to recover space. The following will search from the current directory and find all files larger than 10,000KB (roughly 10MB). The output has been abbreviated.

Finding Files Larger than 10,000k

# find . -size +10000k
./proc/kcore
./var/lib/rpm/Packages
./var/lib/rpm/Filemd5s
...
./home/stage/REPCA/repCA/wireless/USData.xml
./home/stage/REPCA/repCA/wireless/completebootstrap.xml
./home/stage/REPCA/repCA/wireless/bootstrap.xml
./home/bb/bbc1.9e-btf/BBOUT.OLD

Similarly, a - (minus sign) could be used in this example, as in -size -10000k, to find all files smaller than 10,000KB. Of course there would be quite a few of those on a Linux system.
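Multiple -size tests can also be combined to bracket a range of sizes. As a quick sketch (the bounds here are arbitrary), the following finds files between 5,000KB and 10,000KB:

# find . -size +5000k -size -10000k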

The find command is quite flexible and accepts numerous options. We have only covered a couple of the options here but if you want to check out more of them take a look at find’s man page.

Most of find’s options can be combined to find files which meet several criteria. To do this we can just continue to list criteria like we did when finding .txt files which had been modified in the past three days.
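As one last illustration, the -exec option delivers on the promise that find can do some work on the files it finds by running a command against each match. Here is a sketch (the path and file pattern are made up for the example) which compresses .log files larger than 10,000KB that have not been modified in 30 days:

# find /var/log -name '*.log' -size +10000k -mtime +30 -exec gzip {} \;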

For more tips like this check out my book Easy Linux Commands, only $19.95 from Rampant TechPress.

Buy it now!


Robert Vollman has now posted a review of my book Easy Linux Commands on Amazon.

He makes many good points, but one I keep hearing from just about everyone is that almost all of the content of Easy Linux Commands can be applied to other UNIX and UNIX-like systems.

Here is Robert’s full review:

My shelf is full of technical books on a variety of topics, including Linux. But there have been times when someone new to the IT world will ask me for a book to get them started in a particular area. Alas, most of my books are thousand-page, exhaustively detailed volumes that would be so inaccessible that the only use a beginner could get out of them would be to kill a few spiders.

But now, thanks to Jon Emmons and Terry Clark, I finally have a book I can give a young student, or a previously “Windows-only” PC user. “Easy Linux Commands” is just what it claims to be: an easy introduction to the command-line world.

Being easy to read and accessible is this book’s chief selling point. The book is not only under 200 pages, with lots of pictures, big text and barely 30 lines per page, but it’s also structured in the exact same familiar fashion as countless other books. Furthermore, I don’t find the author’s style overly technical. His writing style is very informal and almost conversational. Judge for yourself by visiting his blog “Life After Coffee,” where he occasionally includes excerpts from the book. In fact, if something is not clear, Jon Emmons is very accessible and answers questions quickly and happily.

http://www.lifeaftercoffee.com/

Also notice that I said this book introduces you to the command-line world, not Linux. I said that for two reasons:
1. Almost everything in this book applies equally well to Unix. Very little in this book is actually Linux-specific.
2. Even though Linux has graphical user interfaces, like Gnome and KDE, this book covers command-line Linux only.

One word of caution. Don’t be thrown by the “Become a Linux Command Guru” picture stamped on the front cover. You won’t be a guru. This covers the basics, and only a little more. But this book will get you past square one and allow you to use some of those big books for becoming a guru (instead of an exterminator).

Check out my book Easy Linux Commands, only $19.95 from Rampant TechPress.

Buy it now!



Below I document how I was able to reset the ias_admin password for an Oracle Application Server 9i instance. This may or may not work on other versions or products. If in doubt, check with support.

Oracle’s Enterprise Manager Web Site will enforce use of the current Administrator (ias_admin) password when you log in to Enterprise Manager, stop the Enterprise Manager Service, or change the ias_admin password. If you have forgotten your ias_admin password then you must reset it using the following procedure while you are logged on to your system as the person who installed Oracle Application Server:

1. Edit the following file and locate the line that defines the credentials property for the ias_admin user:

$ORACLE_HOME/sysman/j2ee/config/jazn-data.xml
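If you would rather not scan the file by hand, grep can point out the line for you (the line number and hash will vary from system to system):

$ grep -n credentials $ORACLE_HOME/sysman/j2ee/config/jazn-data.xml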

The relevant portion of jazn-data.xml looks roughly like this (the surrounding structure is approximate, and your credentials hash will differ), with the credentials entry being the line of interest:

<realm>
  <name>enterprise-manager</name>
  <users>
    <user>
      <name>ias_admin</name>
      <credentials>sMrtt1fssLblHhltt97PfnotPLwWsaFr</credentials>
    </user>
  </users>
</realm>

2. Remove the entire line that contains the credentials property from jazn-data.xml.

3. Set a new password with emctl set password reset new_password
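For example, with MyNewPassword standing in as a placeholder for your own choice of password (this assumes emctl lives in $ORACLE_HOME/bin, as it does on a typical install):

$ $ORACLE_HOME/bin/emctl set password reset MyNewPassword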

I hope this helps if folks have this same problem, but as I mentioned above, mileage may vary. If you’re unsure, check with support.


The ls command is the main way to browse directory contents on UNIX and Linux. While it can be used with no options, there are several options which will customize the output.

Using Simple ls Command Options

There will come a time when a user will want to know the last file touched, the last file changed, or maybe the largest or smallest file within a directory. This type of search can be performed with the ls command. Previously the ls command was used to display directories and the files within them, but by using some of the ls command options and piping the output of ls to the head command to limit the number of displayed lines, we can find some of these more specific results.

The following home directory is used for the next few examples. Using the -A option makes ls show files beginning with . but eliminates the . and .. entries from the display.

$ ls -Al
total 44
-rw------- 1 tclark tclark 7773 Feb 2 17:11 .bash_history
-rw-r--r-- 1 tclark tclark 24 Aug 18 11:23 .bash_logout
-rw-r--r-- 1 tclark tclark 191 Aug 18 11:23 .bash_profile
-rw-r--r-- 1 tclark tclark 124 Aug 18 11:23 .bashrc
-rw-r--r-- 1 tclark tclark 237 May 22 2003 .emacs
-rw-rw-r-- 1 tclark tclark 0 Feb 3 09:00 example1.fil
-rw-rw-r-- 1 tclark tclark 0 Jan 13 21:13 example2.xxx
drwxrwxr-x 2 tclark authors 4096 Jan 27 10:17 examples
-rw-r--r-- 1 tclark tclark 120 Aug 24 06:44 .gtkrc
drwxr-xr-x 3 tclark tclark 4096 Aug 12 2002 .kde
-rw-r--r-- 1 tclark authors 0 Jan 27 00:22 umask_example.fil
-rw------- 1 tclark tclark 876 Jan 17 17:33 .viminfo
-rw-r--r-- 1 tclark tclark 220 Nov 27 2002 .zshrc

Finding the File Last Touched (Modified) in a Directory

The -t option is used to sort the output of ls by the time the file was modified. Then, the first two lines can be listed by piping the ls command to the head command.

$ ls -Alt|head -2
total 44
-rw-rw-r-- 1 tclark tclark 0 Feb 3 09:00 example1.fil

Using the pipe (|) character in this way tells Linux to take the output of the command preceding the pipe and use it as input for the second command. In this case, the output of ls -Alt is taken and passed to the head -2 command which treats the input just like it would a text file. This type of piping is a common way to combine commands to do complex tasks in Linux.

Finding the File with the Last Attribute Change

The -c option changes ls to display the last time there was an attribute change to a file, such as a change of permissions, ownership or name.

$ ls -Alct|head -2
total 44
-rw-rw-r-- 1 tclark tclark 0 Feb 3 09:07 example1.fil

Again we are using the head command to only see the first two rows of the output. While the columns for this form of the ls command appear identical, the date and time in the output now reflect the last attribute change. Any chmod, chown, chgrp or mv operation will cause the attribute timestamp to be updated.
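You can watch this happen by changing a file attribute and listing again. A quick illustrative sequence (the timestamp reported will simply be the moment the chmod was run):

$ chmod g+w example1.fil
$ ls -Alct|head -2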

Finding the File Last Accessed in a Directory

Beyond file and attribute modifications we can also look at when files were last accessed. Using the -u option will give the time the file was last used or accessed.

$ ls -Alu|head -2
total 44
-rw------- 1 tclark tclark 7773 Feb 3 08:56 .bash_history

Any of these ls commands could be used without the |head -2 portion to list information on all files in the current directory.

Finding the Largest Files in a Directory

The -S option displays files by their size, in descending order. Using this option and the head command, this time to see the first four lines of output, we can see the largest files in our directory.

$ ls -AlS|head -4
total 44
-rw------- 1 tclark tclark 7773 Feb 2 17:11 .bash_history
drwxrwxr-x 2 tclark authors 4096 Jan 27 10:17 examples
drwxr-xr-x 3 tclark tclark 4096 Aug 12 2002 .kde

Finding the Smallest Files in a Directory

Adding the -r option reverses the display, sorting sizes in ascending order.

$ ls -AlSr|head -4
total 44
-rw-r--r-- 1 tclark authors 0 Jan 27 00:22 umask_example.fil
-rw-rw-r-- 1 tclark tclark 0 Jan 13 21:13 example2.xxx
-rw-rw-r-- 1 tclark tclark 0 Feb 3 09:00 example1.fil

The -r option can also be used with the other options discussed in this section, for example to find the file which has not been modified or accessed for the longest time.
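For example, combining -t with -r sorts the least recently modified files to the top of the list:

$ ls -Altr|head -2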

Use of the ls command options is acceptable when the user is just interested in files in the current working directory, but when we want to search over a broader structure we will use the find command.

For more tips like this check out my book Easy Linux Commands, only $19.95 from Rampant TechPress.

Buy it now!



After my first response to Donald Burleson’s article “The web is becoming a dictatorship of idiots,” Donald responded. Here is his response, followed by my reply.

From: Donald Burleson

Here are my guidelines for finding credible information on the web, and advice on how to weed out crap: sound advice.

In my opinion (and in my own interest) I think everyone should be able to publish anything at anytime.

Me too. I’m all for free speech, but it’s the search engines’ problem that they cannot distinguish between good and bad information. I don’t like the “clutter” it’s causing for the search engines. It ruins my ability to find credible sources of technical information, and I have to wade through pages of total crap from anonymous “experts”. For example, scumbags are stealing credible people’s content and re-publishing it in their own names, with free abandon. Look at what has been stolen from Dr. Hall.

So the system can (and will eventually) balance itself.

I disagree, not until “anon” publications and copied crap are unindexed from the search engines.

If I’m using Google to find technical information I give zero credibility to anonymous sources, and it would be great to have a “credible” way to search the web for people, so they can find stuff from folks like us, who publish our credentials.

We’re in the information age and the flood gates have opened!

Flood is the right word. Some of the Oracle “experts” who publish today would never have been able to publish in print, and for very good reason. There are many self-proclaimed “experts” all over the web, people without appropriate education or background who would never be published in traditional media. And just like “Essjay” on Wikipedia, many of them either fabricate or exaggerate their credentials. They carefully hide their credentials (resume or CV), so nobody knows the truth.

I think it’s up to culture to catch up to technology

I disagree, it’s not “culture”, it’s a simple credibility issue. And what about Wikipedia? Any 9th-grade dropout crackhead can overwrite the work of a Rhodes scholar. That’s not a culture issue, it’s about credibility.

It’s a dictatorship of idiots. One bossy Wikipedia editor tossed about his credentials (“a tenured professor of religion at a private university” with “a PhD. in theology and a degree in canon law.”), when in reality he is a college dropout, a liar and a giant loser.

Wikipedia is the enemy of anyone who wants to find credible data on the web, and they are actively seeking to pollute the web with anon garbage. Read this for details.

It’s the balance between free speech and credibility. Just the raw link-to counts are deceiving. I hear that the #1 Oracle blogger got there only because he wrote a hugely successful blog template, totally unrelated to his Oracle content quality.

The solution is simple. Sooner or later, someone will come up with a “verified credentials” service where netizens pay a fee and an independent body verifies their college degrees, published research, job experience and other qualifications.

Until then, netizens must suffer the dictatorship of idiots, never sure if what they are reading is by someone who is qualified to pontificate on the subject. I do Oracle forensics, and the courts have very simple rules to determine if someone is qualified to testify as an expert, and there is no reason that these criteria cannot be applied on the web, assigning high rank to the qualified and obscurity to the dolts. Until then we must suffer weeding through page after page of questionable publications in our search results.

My response

it’s the search engines’ problem that they cannot distinguish between good and bad information. I don’t like the “clutter” it’s causing for the search engines.

There’s no doubt that web indexing and searching is an imperfect science but identifying the quality of resources is beyond its scope. Search engines like Google, Yahoo and MSN should be considered tools to help find a site with information matching a term or pattern, not necessarily a good site.

scumbags are stealing credible people’s content and re-publishing it in their own names

Plagiarism is not a new problem and, as many have found, search engines can be instrumental in identifying plagiarism. The site Copyscape, which you pointed out to me, makes great use of Google’s API to do exactly that.

> So the system can (and will eventually) balance itself.

I disagree, not until “anon” publications and copied crap are unindexed from the search engines.

If I’m using Google to find technical information I give zero credibility to anonymous sources, and it would be great to have a “credible” way to search the web for people, so they can find stuff from folks like us, who publish our credentials.

And you should not give credibility to a source just because Google finds it. That’s not Google’s job. Google’s job is to find pages (every page if possible) that match the terms you’re entering. Popular sites are weighted to show up earlier in the results, but yes, only because they are popular.

Wikipedia is the enemy of anyone who wants to find credible data on the web, and they are actively seeking to pollute the web with anon garbage.

I think it’s unlikely that Wikipedia is actively trying to pollute the web. Wikipedia is fundamentally flawed for many of the reasons you mention, but it remains accurate on many topics. There is no disguising what it is and it has been largely condemned as an academic resource, but when I need a quick ‘starting point’ reference or the answer to some pop-culture trivia it’s still the place I go.

It’s the balance between free speech and credibility. Just the raw link-to counts are deceiving. I hear that the #1 Oracle blogger got there only because he wrote a hugely successful blog template, totally unrelated to his Oracle content quality.

Actually, I think you’ll find that the #1 Oracle blog you mention is the non-topical personal blog of an Oracle administrator. The fact that he composed an attractive and well-written WordPress theme is a testament to the quality of his work.

The solution is simple. Sooner or later, someone will come up with a “verified credentials” service where netizens pay a fee and an independent body verifies their college degrees, published research, job experience and other qualifications.

Verified credentials would only solve one small piece of the problem. Many people with verifiable credentials are still dead wrong and/or cannot communicate their ideas efficiently enough to be what I consider a good resource.

An even simpler solution already exists. Leading organizations like the Independent Oracle Users Group could take it upon themselves to compile and publish lists of quality resources in their field. With some additional effort I bet these lists could be combined with Google’s search API to provide a web search which only searches a number of “verified” sites.

This type of compilation would not only provide a fantastic list of resources (especially for beginners) but would also shape search results by increasing the page ranking of sites which the organization identifies as good resources.

