Blog posts & such

A ragbag of thoughts, solutions, snippets, tips, tools, notes, whys and wherefores for all things web...

If you do pretty much anything with data, learning the tar command is going to be worth your while. But suppose you do things with lots and lots of heavy data?

Well, one trick you’ll want to add to your arsenal is the --exclude option.

Let’s say you want to package up a website with a ton of images in an image directory, so many that it would take all night to tar them. We’re talking REALLY big.

Where there’s a will, there’s a way. You can simply exclude the unwanted baggage and tar the rest by using --exclude.

It works like this…

Let’s say you want to tar an entire website located in /httpdocs, but you want to exclude the directory located at /httpdocs/files/images.

It’s really just this simple…

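tar -pczf mysitename.tar.gz --exclude "httpdocs/files/images" /httpdocs

This will tar the whole site (which is located in /httpdocs) into a tar.gz file called mysitename.tar.gz but won’t include the directory /httpdocs/files/images. The --exclude option goes before the directory being archived, and the pattern is written without a leading slash. Exclude matching differs a little between tar implementations, so it's worth listing the archive with tar -tzf afterwards to confirm the directory really was skipped.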

The same works for individual files. To exclude more than one file or directory, just add more --exclude options.
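Here is a small sketch of several --exclude options at work. It builds a throwaway copy of a site in a temporary directory, so every file and directory name in it (css, cache, big.jpg and so on) is made up for the demo, and it uses -C to archive relative to that scratch directory rather than touching a real /httpdocs.

```shell
#!/bin/sh
# Build a throwaway site tree (all names here are invented for the demo).
work=$(mktemp -d)
mkdir -p "$work/httpdocs/files/images" "$work/httpdocs/files/cache" "$work/httpdocs/css"
echo 'body {}' > "$work/httpdocs/css/site.css"
echo 'fake image' > "$work/httpdocs/files/images/big.jpg"
echo 'temp data' > "$work/httpdocs/files/cache/page.tmp"

# Archive httpdocs but skip two directories.
# The --exclude options are placed before the directory being archived.
tar -czf "$work/mysitename.tar.gz" -C "$work" \
    --exclude 'httpdocs/files/images' \
    --exclude 'httpdocs/files/cache' \
    httpdocs

# List what actually went into the archive.
listing=$(tar -tzf "$work/mysitename.tar.gz")
echo "$listing"
```

The listing should show the css directory and its file but nothing under files/images or files/cache.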

For reference, here’s a bit more about some of the more common tar options.

  • c - create a new archive.
  • t - list the contents of an archive.
  • x - extract the contents of an archive.
  • f - the archive file name is given on the command line (required whenever the tar output is going to a file).
  • p - preserve the permissions.
  • v - print verbose output (lists file names as they are processed).
  • u - add files to the archive if they are newer than the copy in the tar file.
  • z - compress or decompress the archive with gzip.
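To see how these letters combine, here is a rough round trip: create an archive, list it, then extract it somewhere else. The directory and file names (site, index.html) are placeholders for the demo, and everything happens in temporary directories.

```shell
#!/bin/sh
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/site"
echo 'hello' > "$src/site/index.html"

# c + z + f: create a gzip-compressed archive in the named file.
tar -czf "$src/site.tar.gz" -C "$src" site

# t + z + f: list the contents without extracting anything.
contents=$(tar -tzf "$src/site.tar.gz")
echo "$contents"

# x + p + z + f: extract into another directory, keeping permissions.
tar -xpzf "$src/site.tar.gz" -C "$dest"
restored=$(cat "$dest/site/index.html")
echo "$restored"
```

Adding v to any of these (for example -xpzvf) prints each file name as it is processed.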

Open Active specializes in website development, e-commerce, branding, online and social media marketing, search engine optimization and localization for individuals and organizations large and small around the world. We specialize in launching new products on the Shopify platform!  Contact us at: +1 (212) 537-4042.
Open Active se especializa en desarrollo de sitios web, comercio electrónico, desarrollo de marcas, marketing en línea y en redes sociales, posicionamiento en buscadores, así como localización para individuos y organizaciones grandes y pequeñas en todo el mundo. ¡Nos especializamos en lanzar nuevos productos en la plataforma de Shopify!  Comuníquese con nosotros al: (212) 537-4042.
Open Active هي شركة متخصصة في تطوير مواقع الويب، والتجارة الإلكترونية، وتصميم العلامات التجارية، والتسويق عبر الإنترنت ومواقع التواصل الاجتماعي، وتحسين محركات البحث، والترجمة التي تتسم بالطابع المحلي للأفراد والمؤسسات الكبيرة والصغيرة في جميع أنحاء العالم، ونحن نتخصص في إطلاق المنتجات الجديدة على منصات Shopify.  تواصل معنا على+ (212) 537-4042.

This is the code for our localized select list switcher:

<!doctype html>
<html>
<head>
  <meta charset="utf-8">
  <title>Website Localization Example</title>
  <style>
    #localization { margin: 0 0 0 20px; }
    .localization-container { margin: 20px 0 0 0; border: 1px black dotted; padding: 20px; min-height: 100px; }
    .language { display: flex; align-items: center; }
    .localization-image { max-width: 200px; min-width: 200px; margin: 0 20px 0 0; }
    #arabic { direction: rtl; }
    #arabic .ltr { unicode-bidi: bidi-override; direction: ltr; }
    #arabic .localization-image { margin: 0 0 0 20px; }
    .show { display: initial; }
    .hide { display: none; }
    img { border: 0; max-width: 100%; height: auto; width: auto; }
  </style>
</head>
<body>
  <div id="localization">
    <select name="list" id="list" onchange="updatelocalization()">
      <option value="english">English</option>
      <option value="spanish">Spanish</option>
      <option value="arabic">Arabic</option>
    </select>
    <div class="localization-container">
      <div id="english" class="show">
        <div class="language">
          <div class="localization-image"><img src="https://www.openactive.com/sites/images/new-york.jpg"></div>
          <div class="text-subcontainer"><span class="localization-text">Open Active specializes in website development, e-commerce, branding, online and social media marketing, search engine optimization and localization for individuals and organizations large and small around the world. We specialize in launching new products on the Shopify platform!</span>&nbsp;&nbsp;<span class="localization-contact">Contact us at: +1 (212) 123-4567.</span></div>
        </div>
      </div>
      <div id="spanish" class="hide">
        <div class="language">
          <div class="localization-image"><img src="https://www.openactive.com/sites/images/mexico-city.jpg"></div>
          <div class="text-subcontainer"><span class="localization-text">Open Active se especializa en desarrollo de sitios web, comercio electrónico, desarrollo de marcas, marketing en línea y en redes sociales, posicionamiento en buscadores, así como localización para individuos y organizaciones grandes y pequeñas en todo el mundo. ¡Nos especializamos en lanzar nuevos productos en la plataforma de Shopify!</span>&nbsp;&nbsp;<span class="localization-contact">Comuníquese con nosotros al: +52 55 12 34 56 78.</span></div>
        </div>
      </div>
      <div id="arabic" class="hide">
        <div class="language">
          <div class="localization-image"><img src="https://www.openactive.com/sites/images/dubai.jpg"></div>
          <div class="text-subcontainer"><span class="localization-text">Open Active هي شركة متخصصة في تطوير مواقع الويب، والتجارة الإلكترونية، وتصميم العلامات التجارية، والتسويق عبر الإنترنت ومواقع التواصل الاجتماعي، وتحسين محركات البحث، والترجمة التي تتسم بالطابع المحلي للأفراد والمؤسسات الكبيرة والصغيرة في جميع أنحاء العالم، ونحن نتخصص في إطلاق المنتجات الجديدة على منصات Shopify.</span>&nbsp;&nbsp;<span class="localization-contact">تواصل معنا على</span>:&nbsp;<span class="ltr">+ (212) 537-4042</span>.</div>
        </div>
      </div>
    </div>
  </div>
  <script type="text/javascript">
    // Show the block whose id matches the selected option.
    function updatelocalization() {
      hideall();
      var selected = document.getElementById("list").value;
      document.getElementById(selected).classList.remove("hide");
      document.getElementById(selected).classList.add("show");
    }
    // Hide every language block.
    function hideall() {
      document.getElementById("english").classList.remove("show");
      document.getElementById("spanish").classList.remove("show");
      document.getElementById("arabic").classList.remove("show");
      document.getElementById("english").classList.add("hide");
      document.getElementById("spanish").classList.add("hide");
      document.getElementById("arabic").classList.add("hide");
    }
  </script>
</body>
</html>

When doing anything from the command line, like using Mac's Terminal or running commands in Linux, dealing with spaces can be problematic.

Spaces function as a separator, so on Unix-based systems like Mac and Linux anything that comes after a space is treated as a separate argument. For example, if there were a folder named "Open Active" and you wanted to navigate inside using the cd command, you might be tempted to type:

cd Open Active

However, this only tells the machine to try to navigate inside a folder called "Open" in the current directory while "Active" sort of hangs off to the side like a sixth toe. Assuming you don't have a folder called "Open," you'll probably get a message something like this (newer versions of bash may complain about too many arguments instead):

-bash: cd: Open: No such file or directory

What gives? It's that pesky space. Rather than rename your directory, there are a couple of easy ways to deal with this little problem.

Quoting

My favorite for its utter simplicity and universal applicability is to simply put file names in single quotes (') like this:

cd 'Open Active'

Single quotes are pretty much a universal way to tell a machine to deal with what's inside of the quotes just the way it is, not to do any fancy stuff.

Escaping

There is another method you should be aware of, which works for spaces and also handles a whole variety of other scenarios: using the backslash (\) as an escape. This tells the machine to deal with the next character just the way it is, not to do any fancy stuff.

So, you can also deal with the aforementioned space in the file name problem like this:

cd Open\ Active

The reason that it's not quite as elegant is that if you have a directory or file name that has a lot of spaces, you'll need to do a lot of escaping. For example, if you wanted to remove a file (using the rm command) called "general instructions for how to escape a blank space.txt", you would have to escape 8 times:

rm general\ instructions\ for\ how\ to\ escape\ a\ blank\ space.txt

Irritating, right? Life is much easier when you just do the following:

rm 'general instructions for how to escape a blank space.txt'

That's all there is to it. With these two arrows in your quiver, you can defeat spaces in file and directory names without breaking a sweat.
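As a quick sanity check, here is a sketch that tries both approaches on the "Open Active" directory from the example, inside a temporary directory. It also shows one related case not covered above: when the name is stored in a shell variable (dir here is just an illustrative name), wrapping the variable in double quotes keeps the name in one piece.

```shell
#!/bin/sh
work=$(mktemp -d)
cd "$work" || exit 1

# Create the directory from the example, quoted so the space is kept.
mkdir 'Open Active'

# Single quotes and backslash escaping both name the same directory.
(cd 'Open Active' && pwd)
(cd Open\ Active && pwd)

# When the name lives in a variable, double quotes keep it as one argument.
dir='Open Active'
inside=$(cd "$dir" && basename "$(pwd)")
echo "$inside"
```

Without the double quotes, $dir would be split at the space and cd would see "Open" and "Active" as two separate arguments, which is exactly the problem described above.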

If you're running Apache (and you probably are), there is a simple built-in load test called ApacheBench that you can run to get a basic idea of how your site / server will perform. From the command line, you can run...

ab -n 100 -c 10 http://www.greencrescent.com/

  • ab = ApacheBench
  • -n = the total number of requests to make
  • -c = the number of requests to run concurrently (at the same time)
  • http://www.greencrescent.com/ is the URL we're testing (replace with your target URL)

The output looks like...

root@server [~]# ab -n 100 -c 10 http://www.greencrescent.com/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking www.greencrescent.com (be patient).....done

Server Software:        Apache/2.2.22
Server Hostname:        www.greencrescent.com
Server Port:            80

Document Path:          /
Document Length:        33192 bytes

Concurrency Level:      10
Time taken for tests:   1.103 seconds
Complete requests:      100
Failed requests:        95
   (Connect: 0, Receive: 0, Length: 95, Exceptions: 0)
Write errors:           0
Total transferred:      3355815 bytes
HTML transferred:       3301815 bytes
Requests per second:    90.63 [#/sec] (mean)
Time per request:       110.335 [ms] (mean)
Time per request:       11.033 [ms] (mean, across all concurrent requests)
Transfer rate:          2970.20 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       1
Processing:    56  107  40.1     98     251
Waiting:       50   92  26.5     92     229
Total:         56  107  40.1     98     251

Percentage of the requests served within a certain time (ms)
  50%     98
  66%    106
  75%    111
  80%    121
  90%    188
  95%    214
  98%    243
  99%    251
 100%    251 (longest request)
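If you run the test regularly, you may only care about a couple of numbers from that report. The sketch below pulls out the requests-per-second and failed-request figures with awk. To stay self-contained it works on a few lines copied from the sample report above rather than calling ab. The variable names (report, rps, failed) are just illustrative.

```shell
#!/bin/sh
# A few lines copied from the sample report; in practice you would capture
# the real output of an ab run into this variable instead.
report='Complete requests:      100
Failed requests:        95
Requests per second:    90.63 [#/sec] (mean)
Time per request:       110.335 [ms] (mean)'

# Split each line at the colon, then keep the first word of the value.
rps=$(printf '%s\n' "$report" | awk -F': *' '/^Requests per second/ {split($2, v, " "); print v[1]}')
failed=$(printf '%s\n' "$report" | awk -F': *' '/^Failed requests/ {split($2, v, " "); print v[1]}')

echo "requests/sec: $rps"
echo "failed: $failed"
```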

ab accepts many more options. Here is the full list, as printed by ab -h:

Options are:
    -n requests     Number of requests to perform
    -c concurrency  Number of multiple requests to make
    -t timelimit    Seconds to max. wait for responses
    -b windowsize   Size of TCP send/receive buffer, in bytes
    -p postfile     File containing data to POST. Remember also to set -T
    -u putfile      File containing data to PUT. Remember also to set -T
    -T content-type Content-type header for POSTing, eg.
                    'application/x-www-form-urlencoded'
                    Default is 'text/plain'
    -v verbosity    How much troubleshooting info to print
    -w              Print out results in HTML tables
    -i              Use HEAD instead of GET
    -x attributes   String to insert as table attributes
    -y attributes   String to insert as tr attributes
    -z attributes   String to insert as td or th attributes
    -C attribute    Add cookie, eg. 'Apache=1234. (repeatable)
    -H attribute    Add Arbitrary header line, eg. 'Accept-Encoding: gzip'
                    Inserted after all normal header lines. (repeatable)
    -A attribute    Add Basic WWW Authentication, the attributes
                    are a colon separated username and password.
    -P attribute    Add Basic Proxy Authentication, the attributes
                    are a colon separated username and password.
    -X proxy:port   Proxyserver and port number to use
    -V              Print version number and exit
    -k              Use HTTP KeepAlive feature
    -d              Do not show percentiles served table.
    -S              Do not show confidence estimators and warnings.
    -g filename     Output collected data to gnuplot format file.
    -e filename     Output CSV file with percentages served
    -r              Don't exit on socket receive errors.
    -h              Display usage information (this message)
    -Z ciphersuite  Specify SSL/TLS cipher suite (See openssl ciphers)
    -f protocol     Specify SSL/TLS protocol (SSL2, SSL3, TLS1, or ALL)

We recently had to remove a file with a pretty odd file name: "\ я\ 005-2.jpg".

Normally, you can deal with oddities (such as spaces in file names) by just putting the file name in quotes and removing it as usual, like...

rm -rf "\ я\ 005-2.jpg"

However, when it comes to backslashes and other special characters in file names, things get tricky.

The way to remove a file like the above is to first find its inode number. You can do this in one of two ways:

First, using \ я\ 005-2.jpg as an example, you can run stat from the command line.

stat \ я\ 005-2.jpg

This will give you an output that looks like this:

  File: ` я 005-2.jpg'
  Size: 150029          Blocks: 304        IO Block: 4096   regular file
Device: 10302h/66306d   Inode: 13770830    Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (  502/   green)   Gid: (  502/   green)
Access: 2013-07-08 12:46:49.000000000 -0700
Modify: 2011-02-26 04:23:09.000000000 -0800
Change: 2013-07-11 09:47:57.000000000 -0700

A second option is to run ls -li as in...

ls -li \ я\ 005-2.jpg

This will give you an output like this:

13770830 -rwxr-xr-x 1 green green 150029 Feb 26 2011 \ я\ 005-2.jpg

Either way, you can see that the inode number is 13770830.

To get rid of the troublesome \ я\ 005-2.jpg (a.k.a. 13770830), use the find command: -inum selects the file by its inode number and -exec rm removes whatever was found. For example:

find . -inum 13770830 -exec rm -rf {} \;
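The whole routine can be tried safely in a scratch directory. This sketch creates a stand-in file with a leading space and a non-ASCII letter in its name (plus a second file, keep-me.jpg, that is purely a made-up name to show nothing else gets removed), looks up the inode with ls -i, and hands it to find. Because the inode number will differ on every machine, it is read into a variable instead of being typed in.

```shell
#!/bin/sh
work=$(mktemp -d)
cd "$work" || exit 1

# A stand-in for the awkward file, plus one file that should survive.
touch ' я 005-2.jpg'
touch 'keep-me.jpg'

# ls -i prints the inode number first; awk grabs that first field.
inode=$(ls -i ' я 005-2.jpg' | awk '{print $1}')
echo "inode: $inode"

# Remove whatever has that inode, without typing the name again.
find . -inum "$inode" -exec rm -f {} \;

# Only the other file should be left.
remaining=$(ls)
echo "$remaining"
```

One thing to keep in mind: hard links share an inode, so find will match every link to the file under the directory you search.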

Farewell \ я\ 005-2.jpg, we hardly knew ya.