The Rack Race update 2

There’s a new system on the block here at Oracle: the X4-8

To cut a long story short, there is 6TB of RAM in 5U (120 cores, Xeon E7-8895 v2).

In a 42U rack, that works out to be 48TB of main memory and 960 cores!
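If you want to check those numbers yourself, here is a rough sketch of the arithmetic in shell. The function name rack_totals is made up for illustration, and it assumes the whole rack is given over to servers, with nothing reserved for switches or PDUs.

```shell
#!/bin/sh
# Hypothetical helper: how many systems fit in a rack, and what they add up to.
# Usage: rack_totals RACK_U SYSTEM_U TB_PER_SYSTEM CORES_PER_SYSTEM
rack_totals() {
    rack_u=$1; system_u=$2; tb_each=$3; cores_each=$4
    # Integer division: a system that only partly fits does not count.
    systems=$((rack_u / system_u))
    echo "$systems systems, $((systems * tb_each))TB RAM, $((systems * cores_each)) cores"
}

# X4-8: 5U, 6TB and 120 cores per system, in a 42U rack.
rack_totals 42 5 6 120
```

Eight 5U systems take up 40U, which leaves 2U spare in the rack.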

You can find more information about this system here:

https://blogs.oracle.com/hardware/entry/oracle_s_sun_server_x4

 

 

Updating Solaris 11

It used to be the case that you could happily update your OpenSolaris install for free.

Now, with Oracle Solaris, you need to get a key and certificate as part of your support subscription to get the updates from Oracle.

Here’s a very brief overview of what you’ll need to do.

This is how your publisher is set up by default:

jim@solaris11:~$ pkg publisher
 PUBLISHER                             TYPE     STATUS   URI
 solaris                               origin   online   http://pkg.oracle.com/solaris/release/

You’ll need to update your publisher like this:

jim@solaris11:~$ pfexec pkg set-publisher -k Oracle_Solaris_11_Support.key.pem -c Oracle_Solaris_11_Support.certificate.pem -g https://pkg.oracle.com/solaris/support -G http://pkg.oracle.com/solaris/release/ solaris

jim@solaris11:~$ pkg publisher
 PUBLISHER                             TYPE     STATUS   URI
 solaris                               origin   online   https://pkg.oracle.com/solaris/support/

(I’m assuming you’ve already got a support subscription, and have downloaded your key and certificate…)

Now you’ve done that, you can list which updates are available. In this case, we want to update the “entire” package which represents the whole OS:

jim@solaris11:~$ pkg list -af entire
 NAME (PUBLISHER)                                  VERSION                    IFO
 entire                                            0.5.11-0.175.0.9.0.5.0     ---
 entire                                            0.5.11-0.175.0.8.0.5.0     ---
 entire                                            0.5.11-0.175.0.7.0.5.0     ---
 entire                                            0.5.11-0.175.0.6.0.6.0     ---
 entire                                            0.5.11-0.175.0.5.0.5.0     ---
 entire                                            0.5.11-0.175.0.5.0.4.0     ---
 entire                                            0.5.11-0.175.0.4.0.6.0     ---
 entire                                            0.5.11-0.175.0.4.0.5.0     ---
 entire                                            0.5.11-0.175.0.3.0.4.0     ---
 entire                                            0.5.11-0.175.0.2.0.4.0     ---
 entire                                            0.5.11-0.175.0.2.0.3.0     ---
 entire                                            0.5.11-0.175.0.2.0.3.0     ---
 entire                                            0.5.11-0.175.0.1.0.5.0     ---
 entire                                            0.5.11-0.175.0.1.0.4.0     ---
 entire                                            0.5.11-0.175.0.0.0.2.0     i--
 entire                                            0.5.11-0.151.0.1.14        ---
 entire                                            0.5.11-0.151.0.1.13        ---
 entire                                            0.5.11-0.151.0.1.12        ---
 entire                                            0.5.11-0.151.0.1.11        ---
 entire                                            0.5.11-0.151.0.1.10        ---
 entire                                            0.5.11-0.151.0.1.9         ---
 entire                                            0.5.11-0.151.0.1.8         ---
 entire                                            0.5.11-0.151.0.1.7         ---
 entire                                            0.5.11-0.151.0.1.6         ---
 entire                                            0.5.11-0.151.0.1.5         ---
 entire                                            0.5.11-0.151.0.1.4         ---
 entire                                            0.5.11-0.151.0.1.3         ---
 entire                                            0.5.11-0.151.0.1.2         ---
 entire                                            0.5.11-0.151.0.1.2         ---
 entire                                            0.5.11-0.151.0.1.1         ---
 entire                                            0.5.11-0.151.0.1           ---

There is a better explanation of the entire package here:

jim@solaris11:~$ pkg info entire
 Name: entire
 Summary: Incorporation to lock all system packages to the same build
 Description: This package constrains system package versions to the same
 build.  WARNING: Proper system update and correct package
 selection depend on the presence of this incorporation.
 Removing this package will result in an unsupported system.
 Category: Meta Packages/Incorporations
 State: Installed
 Publisher: solaris
 Version: 0.5.11
 Build Release: 5.11
 Branch: 0.175.0.0.0.2.0
 Packaging Date: October 20, 2011 02:38:22 PM
 Size: 5.45 kB
 FMRI: pkg://solaris/entire@0.5.11,5.11-0.175.0.0.0.2.0:20111020T143822Z

Before you update that package, you’ll notice that you currently have a single Boot Environment (BE):

jim@solaris11:~$ beadm list
BE      Active Mountpoint Space Policy Created          
--      ------ ---------- ----- ------ -------          
solaris NR     /          4.26G static 2012-07-28 15:12

When you do update the package, a new BE will be made for you. Making a new BE means that changes aren’t made to the original; if the install goes badly, then all you need to do is boot back into the original again.

Following that, you can simply go ahead and update the package with one command:

jim@solaris11:~$ pfexec pkg update entire@0.5.11-0.175.0.9.0.5.0
           Packages to install:   2
            Packages to update: 243
           Mediators to change:   1
       Create boot environment: Yes
Create backup boot environment:  No

DOWNLOAD                                  PKGS       FILES    XFER (MB)
Completed                              245/245   8410/8410  237.6/237.6

PHASE                                        ACTIONS
Removal Phase                              2240/2240 
Install Phase                              2541/2541 
Update Phase                               9778/9778 

PHASE                                          ITEMS
Package State Update Phase                   488/488 
Package Cache Update Phase                   243/243 
Image State Update Phase                         2/2 

A clone of solaris exists and has been updated and activated.
On the next boot the Boot Environment solaris-1 will be
mounted on '/'.  Reboot when ready to switch to this updated BE.

So now you can run /usr/sbin/beadm and see both BEs:

jim@solaris11:~$ beadm list
BE        Active Mountpoint Space Policy Created          
--        ------ ---------- ----- ------ -------          
solaris   N      /          5.34M static 2012-07-28 15:12 
solaris-1 R      -          5.70G static 2012-07-29 16:01

Decent questions, decent answers

Garbage In, Garbage Out (GIGO). We all know that if we don’t ask clear, well-described questions, it’s not really possible to get useful answers back without first clarifying what the asker meant. With this in mind, people in the IT industry often get frustrated with poorly asked questions, and are vocal about it too.

Sometimes, we find that just asking ourselves the question out loud can produce a sensible answer. I was reading an interesting post today which offered this amusing dialogue about that scenario. It goes like this:

 

Bob pointed into a corner of the office. “Over there,” he said, “is a duck. I want you to ask that duck your question.”

I looked at the duck. It was, in fact, stuffed, and very dead. Even if it had not been dead, it probably would not have been a good source of design information. I looked at Bob. Bob was dead serious. He was also my superior, and I wanted to keep my job.

I awkwardly went to stand next to the duck and bent my head, as if in prayer, to commune with this duck. “What,” Bob demanded, “are you doing?”

“I’m asking my question of the duck,” I said.

One of Bob’s superintendents was in his office. He was grinning like a bastard around his toothpick. “Andy,” he said, “I don’t want you to pray to the duck. I want you to ask the duck your question.”

I licked my lips. “Out loud?” I said.

“Out loud,” Bob said firmly.

I cleared my throat. “Duck,” I began.

“Its name is Bob Junior,” Bob’s superintendent supplied. I shot him a dirty look.

“Duck,” I continued, “I want to know, when you use a clevis hanger, what keeps the sprinkler pipe from jumping out of the clevis when the head discharges, causing the pipe to…”

In the middle of asking the duck my question, the answer hit me. The clevis hanger is suspended from the structure above by a length of all-thread rod. If the pipe-fitter cuts the all-thread rod such that it butts up against the top of the pipe, it essentially will hold the pipe in the hanger and keep it from bucking.

I turned to look at Bob. Bob was nodding. “You know, don’t you,” he said.

“You run the all-thread rod to the top of the pipe,” I said.

“That’s right,” said Bob. “Next time you have a question, I want you to come in here and ask the duck, not me. Ask it out loud. If you still don’t know the answer, then you can ask me.”

“Okay,” I said, and got back to work.

The Rack Race update 1

It’s time to revisit the question, “How much compute/RAM can we get in a 42U rack?”

Using the SunBlade 6000 chassis you can get 10 blades in 10U. When we fully populate the rack with T4-1B blades we get 40 SPARC procs @ 2.85GHz (2560 threads)

…and 256GB of RAM * 40 machines = 10240GB of RAM, i.e. 10TB!

Given the SPARC road map, and the likely doubling of the core count (achievable following a die shrink to 25nm) for T5 chips, we could see a doubling of that thread count!

It’s interesting to think that these machines can saturate 10GbE easily with plenty of CPU time to spare. With this in mind, that would be 80 * 10GbE ports, so you’ll need some decent network equipment to keep all those machines fed!
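Here is a rough sketch of the sums in this post as a shell script. Two figures are my assumptions rather than anything stated above: 64 threads per T4 chip (8 cores with 8 threads each), and 2 * 10GbE ports per blade, which is simply 80 ports divided by 40 blades. The variable names are made up for illustration.

```shell
#!/bin/sh
# Figures from the post: 10 blades per 10U chassis, 256GB per T4-1B blade.
chassis=$((42 / 10))            # 4 chassis fit in a 42U rack
blades=$((chassis * 10))        # 40 blades, one T4 chip each

# Assumption: a T4 has 8 cores with 8 threads each (64 threads per chip).
threads=$((blades * 8 * 8))

ram_gb=$((blades * 256))

# Assumption: 2 x 10GbE ports per blade, inferred from 80 ports / 40 blades.
ports=$((blades * 2))
gbit=$((ports * 10))

echo "$blades blades, $threads threads, ${ram_gb}GB RAM"
echo "$ports x 10GbE ports, $gbit Gbit/s in total if every port is saturated"
```

The last line is the one to worry about: 800 Gbit/s is a lot of traffic to hand to a switch.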

 

State the problem, specify the problem

Anyone working in a services / performance tuning role will quickly understand the value of the title.

For any problem solving to start, it’s critical for you to have a clear idea about what you think the problem actually is. That might sound obvious to you, but it’s very easy to get lost in someone’s description of what’s happening.

Imagine the scene where you get a phone call and someone says “you need to fix the network, it seems broken…”

What does that sentence really mean? You won’t find out until you start asking sensible questions about what the stakeholder is seeing. If the user was on a 10/100 Mbit link and they were seeing about 10-12 MB per second, then that’s probably about as fast as it’s going to go (note Mbit vs MB). To get over this “problem” we have to make a decision: compress the files, use a different machine, etc.
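To make the Mbit vs MB point concrete, here is a small sketch in shell. The helper name mbit_to_mbyte is made up for illustration. It only divides by 8 bits per byte and ignores protocol overhead, so real transfers will come in a little under these ceilings.

```shell
#!/bin/sh
# Hypothetical helper: theoretical ceiling in MB/s for a link speed in Mbit/s.
mbit_to_mbyte() {
    # Scale by 10 first so integer arithmetic keeps one decimal place.
    tenths=$(( $1 * 10 / 8 ))
    echo "$((tenths / 10)).$((tenths % 10))"
}

echo "100 Mbit/s  = $(mbit_to_mbyte 100) MB/s at best"
echo "1000 Mbit/s = $(mbit_to_mbyte 1000) MB/s at best"
```

A 100 Mbit link tops out at 12.5 MB/s, so a user seeing 10-12 MB per second is already getting nearly everything the link can give.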

Until you know the root cause of the problem, you can’t suggest a fix. You can’t even mitigate the issue.

Let’s say the above network was using an ACME switch, known to not be fully non-blocking. If you don’t know how much bandwidth is in use, then can you say for sure that just swapping out the switch for a brand new switch (of the same model) is going to fix it? No. So if you did that, you’d be swapping out a perfectly ok switch for a brand new one which is going to experience exactly the same behaviour.

Below is a video of a fictitious, but fairly realistic, scenario. A user calls up and complains about the website being “down”. The webserver seems to be working, but the user insists that an action be taken. The analyst doesn’t take the time to actually determine what problem the user is seeing, and instead simply complies with the request for action. Watch the video to see the rest!

Jumping to a cause broke the website completely, which could have been averted if the analyst had stuck to his guns and produced a proper problem statement and description.

Keep in mind that the true root cause will explain the symptoms that are seen. So when someone makes a suggestion about what the cause could be, test it against the specification and see if it seems true.

Work and motivation

So what really motivates you? Here’s a fascinating little video about the use of money as a motivator for work…

New Sun/Oracle hardware

Well, it’s been a while now, but new hardware has been announced and it really is quite impressive!

Key Stats

There’s more than the list that I’m showing, but these are the most interesting ones. Among the other bits are the X4170, X6270 and some new NEMs for the SunBlade 6000 chassis.

X4800

  • 5 Rack Units high
  • 1 TB of RAM (With 8GB DIMMS, 128 slots)
  • 4 or 8 * Xeon 7500 series CPUs (each with 8 cores)
  • 8 PCIe slots
  • Up to 8 * 300GB 2.5 inch SAS-2 disks
  • Two NEMs, each with four 10Gb Ethernet ports
  • Redundant power supplies

X4470

  • 3 Rack Units high
  • 512 GB RAM (with 8GB DIMMS, 64 slots)
  • 2 or 4 Xeon 7500 series CPUs (each with 8 cores)
  • 10 PCIe slots
  • Up to 6 * 300GB 2.5 inch SAS-2 disks
  • Redundant power supplies

X4170 M2 / X4270 M2

  • 1 and 2 Rack Units high respectively
  • Up to 12 * 300GB SAS-2 2.5 inch, or up to 24 * 300GB SAS-2 2.5 inch disks respectively
  • 2 CPUs each
  • 144GB RAM each
  • 4 * Gb Ethernet ports onboard each

These images were shamelessly copied from www.c0t0d0s0.org, whose author also wrote a far better article than I did!

Google chrome benchmarked

Well, it appears that Google Chrome is faster than a lot of things. Namely:

  1. Potatoes fired from a cannon
  2. Paint shot by sound wave energy
  3. Lightning striking a little ship (big ships untested)

All is explained and proved in the following little video. Credit to whoever edited the film, but I do have to say the final test doesn’t look 100% fair; the mouse may have been clicked a split second too soon.

Robert Fisk on modern journalism

I thought I’d share a lecture given by Middle East reporter Robert Fisk. Here he expresses his views on the discourse of modern journalism, and the effect that the news has had on him. I admire his ability to not only reflect on his thoughts about current affairs, but to reflect on his own qualities as a human being. I do wonder how many people would admit their own failings to an open audience, and then follow up with their desire to correct themselves in a humble fashion.

The underappreciated Bourne shell “:” operator

“:” is a little-known Bourne shell operator which is actually quite handy. However, like a lot of other shorthand operators, it’s easy to forget, especially when it’s not used that much.

So what can it do?

  • You can replace the true command with it, letting you write something like:

while :
do
some_commands
done

  • Leave the then part of an if statement empty:

if :
then :
else :
fi

I agree, that example is probably useless for now, but amusing nonetheless.
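For something a bit more practical, here is a sketch of two other uses that I believe work in any POSIX-style shell. They aren’t part of the list above, and the variable and file names are made up for illustration. Both rely on the fact that “:” does nothing itself, while the shell still expands its arguments and sets up any redirections.

```shell
#!/bin/sh
# 1. Default a variable. ":" ignores its arguments, but the shell still
#    expands them, so ${GREETING:=hello} assigns when GREETING is unset or empty.
unset GREETING
: "${GREETING:=hello}"
echo "GREETING is $GREETING"

# 2. Create or truncate a file without running an external command.
scratch=${TMPDIR:-/tmp}/colon_demo.$$    # hypothetical scratch file
echo "old contents" > "$scratch"
: > "$scratch"                           # the redirection empties the file
size=$(wc -c < "$scratch" | tr -d ' ')   # strip padding some wc versions add
echo "scratch file is now $size bytes"

rm -f "$scratch"
```

The first trick is handy at the top of a script, where it lets the caller override a setting from the environment without you writing an if block.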

So there you are! Next time you fancy straying from the norm of using true, you know what to do!