Saturday, December 10, 2011

How to Learn WebSphere in 31 Days - Part Four: A Refresher of UNIX, TCP/IP, and Networking

In this section, we review the UNIX, TCP/IP, and networking topics that matter most to WebSphere system engineering. First, we explain why these topics are critical to a WebSphere system engineer. Then we cover them as a refresher, focusing on what a WebSphere system engineer needs to do his or her job successfully. We assume that you are already familiar with UNIX, TCP/IP, and networking; this section is not a systematic treatment of any of them.

UNIX, TCP/IP, and networking are important to a WebSphere system engineer

As a WebSphere engineering manager working in this field for more than ten years, I have hired many WebSphere engineers. When I review the experience and technical skills of a candidate, I focus on the following areas.

  • Operating System skills, especially UNIX
  • Networking skills including TCP/IP
  • Programming skills in Java, UNIX shell scripting, Jython or Python
  • WebSphere Application Server skills


A very large number of WebSphere systems run on UNIX platforms, so UNIX skills are mandatory for a WebSphere system engineer's daily work. For instance, a WebSphere system engineer has to know the locations of WebSphere system files as well as the locations of JEE applications in a UNIX environment. To access system and application files and perform WebSphere system operations, you have to know how UNIX security works and how to manage it. Experienced WebSphere system engineers have a script for almost every WebSphere system operation. Learning how to write shell scripts (as well as WebSphere automation programs in Jython or Python, which we will cover later) and how to schedule and run these scripts as UNIX jobs is therefore imperative if you want to be a competitive WebSphere system engineer. Last but not least, a WebSphere system engineer must be proficient with a number of UNIX commands for routine operations and troubleshooting, for example, a command to determine whether there is enough disk space for WebSphere to run.
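
The disk space check mentioned above could look like the following minimal sketch. The function names free_kb and has_space, the path, and the 1 GB threshold in the usage comment are assumptions chosen for illustration, not part of any WebSphere tooling.

```shell
#!/bin/sh
# Sketch: check whether the file system holding a directory has enough free space.

# free_kb DIR -> prints the available space, in KB, of the file system containing DIR
free_kb() {
    # -P forces the POSIX one-line-per-file-system format, -k reports 1K blocks;
    # column 4 of the second output line is the available space.
    df -Pk "$1" 2>/dev/null | awk 'NR == 2 { print $4 }'
}

# has_space DIR MIN_KB -> exit status 0 if at least MIN_KB is available
has_space() {
    avail=$(free_kb "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Example (hypothetical threshold of 1 GB = 1048576 KB):
#   has_space /opt/IBM/WebSphere 1048576 || echo "WARNING: low disk space"
```

Using df -P keeps the output format stable across UNIX flavors, which makes the column position safe to parse.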

UNIX topics 

First, let's review the following UNIX topics.

  • UNIX File System – the location of WebSphere system files and JEE application files as well as a review of typical setup of UNIX file system
  • UNIX Security Model – the topics that are important to WebSphere system engineers
  • UNIX Commands – a group of UNIX commands frequently used
  • Shell Scripting – scripting in terms of WebSphere system operation automation 

UNIX File System 

With the UNIX file system and security model, our objective is to identify where the WebSphere system and application files are and how to get access to them to get the job done.
As you may recall, the actual locations and names of certain UNIX system files vary between UNIX implementations. The following are examples of UNIX system directories:
  • /bin/ stores executable and common system utilities, like ls, cp, and rm.
  • /etc/ is the location of system configuration files and databases. For example, some UNIX implementations store crontab-related files here.
  • /var/spool/cron/crontabs is where Ubuntu keeps the crontab-related files.
  • /usr/bin/ has additional user commands.
  • /usr/lib/ stores more programming and system call libraries.
  • /usr/local/ is typically a place where local utilities go.
  • /usr/man/ keeps the UNIX manual pages.
  • /opt/IBM/WebSphere is the usual WebSphere directory as recommended by IBM. However, different companies have different enterprise WebSphere system standards that decide where the WebSphere system files should be installed.

The most frequently used WebSphere Application Server directories are listed below. The WebSphere Application Server file structure changes somewhat between releases, but is mostly stable. The following is accurate for WebSphere Application Server 8. The most used items are tools, log files, property files, and configuration files.
 
The tools are invariably in a bin directory. For example, if you have forgotten your password, you can use wsadmin from the bin directory to disable security.
First, go to /opt/IBM/WebSphere/AppServer/profiles/AppSrv01/bin/, where wsadmin is located.
Then, run the following steps, if you use Ubuntu:
  • sudo ./wsadmin.sh -conntype NONE
  • securityoff
  • Exit wsadmin and recycle (restart) the server

Here is another example: use the shell script in the following directory to start a server.
/opt/IBM/WebSphere/AppServer/profiles/AppSrv01/bin/

If you are working on Ubuntu, use the following command.
sudo ./startServer.sh server1

While the server is starting, you can use another window to go to the log directory and monitor the log file to see how the startup is going. The log directory is named after the server, server1 in this example.
/opt/IBM/WebSphere/AppServer/profiles/AppSrv01/logs/server1/
You can use the following command, if you use Ubuntu.
sudo tail -f startServer.log

For configuration files, you can find them in the following directory.
/opt/IBM/WebSphere/AppServer/profiles/AppSrv01/config/

If you forget your username and password for WebSphere, you can also modify the security.xml file directly to disable security. Go to the following directory.
/opt/IBM/WebSphere/AppServer/profiles/AppSrv01/config/cells/Node01Cell/
  • Locate security.xml and open the file in a text editor.
  • Find the first "enabled" string and change the "true" string to "false".
  • Then, restart the server without the username and password.
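
The manual edit above can also be scripted. The following is a minimal sketch, assuming the attribute appears literally as enabled="true" in the file; the function name disable_security is hypothetical, and the script keeps a backup copy so the change can be reverted.

```shell
#!/bin/sh
# Sketch: flip only the FIRST enabled="true" in security.xml to enabled="false".

# disable_security FILE -> edits FILE in place, leaving the original in FILE.bak
disable_security() {
    file=$1
    cp -p "$file" "$file.bak" || return 1      # keep a backup before editing
    # sub() changes the first match on a line; the "done" flag stops
    # any further substitutions once one line has been changed.
    awk '!done && sub(/enabled="true"/, "enabled=\"false\"") { done = 1 } { print }' \
        "$file.bak" > "$file"
}

# Example (path taken from the text above):
#   cd /opt/IBM/WebSphere/AppServer/profiles/AppSrv01/config/cells/Node01Cell/
#   disable_security security.xml
```

Restoring security.xml.bak over security.xml undoes the change.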

Property files can be found in the following directory.
/opt/IBM/WebSphere/AppServer/profiles/AppSrv01/properties/

The default scripting language of wsadmin is Jacl. Therefore, each time you want to run a script written in Jython, you have to specify the language with the following command, if you use Ubuntu (we will talk more about Jython and wsadmin later).
sudo ./wsadmin.sh -lang jython

However, sometimes you may forget, and as a result your Jython script will not work and the troubleshooting may cost you 15 minutes. You can modify the wsadmin.properties file to make Jython the default language.

First, go to the following directory.
/opt/IBM/WebSphere/AppServer/profiles/AppSrv01/properties/

Locate and open the wsadmin.properties file using a text editor such as vi.

Find the following property and change its value from jacl to jython, so that the line reads as follows.
com.ibm.ws.scripting.defaultLang=jython
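
If you prefer not to edit the file by hand, the same change can be made with sed. This is a minimal sketch; the function name set_default_lang is hypothetical, and the property name is the one shown above.

```shell
#!/bin/sh
# Sketch: set the default wsadmin scripting language in wsadmin.properties.

# set_default_lang FILE LANG -> rewrites the defaultLang line, keeps a backup in FILE.bak
set_default_lang() {
    file=$1
    lang=$2
    cp -p "$file" "$file.bak" || return 1
    # Only the line that starts with the property name is rewritten.
    sed "s/^com\.ibm\.ws\.scripting\.defaultLang=.*/com.ibm.ws.scripting.defaultLang=$lang/" \
        "$file.bak" > "$file"
}

# Example (path taken from the text above):
#   cd /opt/IBM/WebSphere/AppServer/profiles/AppSrv01/properties/
#   set_default_lang wsadmin.properties jython
```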

UNIX security and WebSphere
Now let's go over how to set up security to access system and application files and related topics. 
 
User ID and Password

When you join a new WebSphere engineering team, you need to quickly learn how to log into the WebSphere systems and perform your job functions as a WebSphere engineer. You need at least three WebSphere system IDs. Apply for these system IDs and learn how to use them to access your WebSphere systems as soon as possible. Because companies manage system access differently, there is always a small learning curve, even for very experienced WebSphere engineers.

·      UNIX system logon ID – this is the ID you use to log into the UNIX servers that run WebSphere Application Server.

o   Depending on the information security policy and team structure, this UNIX ID may or may not give you root access. If you work for a company where the WebSphere system engineers play the roles of both UNIX system administrator and WebSphere system administrator, your UNIX ID usually gives you root access.

o   For ease of system administration and consistency of security policy, your UNIX system ID typically belongs to a group that has full privileges on the WebSphere system files, or root access if the WebSphere engineering team is expected to be its own UNIX system administrator.

·      WebSphere Administrative Console ID – this is what you use to access the WebSphere administrative console.

o   If you forget this ID or its password, you can disable WebSphere security with the wsadmin facility. Then you can log into the console and enable security again.

o   Again, your administrative console ID belongs to a “job code” group. You apply to the information security team to have your ID added to this group. WebSphere logs the activities of this ID.

·      WebSphere Application Server ID – this is the UNIX ID that the WebSphere Application Server runs under in the UNIX environment. Usually, it is wasadmin. Sometimes it is useful and safer to switch to this ID when you work on WebSphere Application Server. This is especially true if you have root access, because working under this ID limits the damage an accidental mistake can do.

o   sudo su - wasadmin

o   The WebSphere Application Server ID and password are “public information” within the WebSphere engineering team. Therefore, be careful and honest when you use this ID to do system work. Many companies run software such as PowerBroker that records every command you use. The monitoring software can trace every command that you issue back to your own UNIX ID.

o   You can use wasadmin and its password to launch scripts through the wsadmin port from a remote computer. This use of the wasadmin application ID is not traced today. If you see system changes that cannot be explained by the WebSphere administrative console access logs or the UNIX system access audit logs, this is a possible root cause to explore.

·      In a “security hardened” environment, all security objects such as user names and UNIX system IDs are managed consistently across the enterprise using a directory service, usually some kind of LDAP server. The availability of the LDAP service is therefore important not only to your own system access but also to your WebSphere Application Server.

·      If your organization does not give the WebSphere engineering team root access, you usually need to apply for temporary root access before you perform WebSphere Application Server installation work. The security team will then add you to a group that has a modified set of root privileges so that you can do the installation. Usually, such root privileges are time-limited and are revoked after a period set by your organization.

·      Security software such as PowerBroker for UNIX from BeyondTrust.com records every command that you issue after logging into the system. Frequently, the log file is encrypted and shipped to a secure location where you cannot manipulate or modify it. After a serious incident in which the company sustains heavy losses, these log files are retrieved and analyzed to determine what happened.
 
Frequently used UNIX commands

Let's review the UNIX commands frequently used by WebSphere system engineers. I group the commands into a few functional groups and use Ubuntu for the examples. A different UNIX product such as AIX may have different commands. However, what matters is learning how to use this type of command to do what you need to do as a WebSphere engineer.
 
Know what WebSphere and Java software are running
When you are new to a WebSphere system, you want to know quickly what is running.

Directory                          Ubuntu command   Purpose                               Example
/opt/IBM/WebSphere/AppServer/bin   versionInfo.sh   Determine WebSphere product version   sudo ./versionInfo.sh
/usr/sbin                          apachectl -v     Determine Apache web server version   sudo ./apachectl -v
-                                  java             Determine Java version                java -version
-                                  ps               Determine what servers are running    ps -ef | grep java

Know the platform
Frequently, you need to know what platform you are working on. To be practical, I list the relevant AIX commands as well.

Ubuntu command      Purpose                                              Ubuntu example      AIX command   AIX example
cat /proc/cpuinfo   View all processors, clock speeds, flags, and more   cat /proc/cpuinfo   prtconf -c    prtconf -c
cat /proc/meminfo   View the amount of RAM and swap and how it is used   cat /proc/meminfo   prtconf -m    prtconf -m
df                  View disk space and usage                            df -H               df -H         df -H
-                   Display the kernel in use (32-bit or 64-bit)         -                   prtconf -k    prtconf -k
lshw                Display hardware                                     sudo lshw           uname         uname -a


Know system performance
You need to know how the system is doing, for example, how the CPU, memory, network, or storage is doing, especially in troubleshooting.

Command   Purpose                                       Example      Notes
top       Overall system usage by process               top          Some organizations disable this command
vmstat    Virtual memory statistics                     vmstat -a    Use switches to select what you need
netstat   Network statistics                            netstat -i   -i shows per-interface statistics
ps        Process system usage                          ps -ef       Displays every process in standard format
df        Disk usage statistics                         df -k        Block size is 1K
du        Summary of disk usage                         du -a        Displays the disk usage of each file
lsof      Information about files opened by processes   lsof         -
iostat    I/O statistics                                iostat 1 5   5 reports at 1-second intervals
 
Work with processes
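You sometimes need to identify a WebSphere process and use the kill command either to produce dumps or to kill the process when it becomes unresponsive. Note that kill -3 terminates an ordinary UNIX process with a core dump, but a Java process such as WebSphere Application Server responds to it by writing a thread dump and keeps running.

Command              Purpose                            Example              Note
ps -ef | grep java   Find the process                   ps -ef | grep java   This returns a lot of information on the process.
kill -3 PID          Produce a thread dump (javacore)   kill -3 13455        13455 is a sample PID; the JVM keeps running
kill -9 PID          Forced termination                 kill -9 13455        13455 is a sample PID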
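
When you only need the process IDs, the output of ps can be filtered down to the PID column. The sketch below is one way to do it; the function name java_pids is hypothetical, and it simply matches any process whose ps line contains "java", so review the list before you send a signal to any of the PIDs.

```shell
#!/bin/sh
# Sketch: extract the PIDs of Java processes from "ps -ef" output.

# java_pids -> reads "ps -ef" output on stdin and prints column 2 (the PID)
# of every line that mentions java
java_pids() {
    # Writing the pattern as [j]ava prevents the filter from matching
    # its own command line in the process list.
    awk '/[j]ava/ { print $2 }'
}

# Example:
#   ps -ef | java_pids
# Then, for a PID that you have confirmed is the WebSphere JVM:
#   kill -3 <PID>
```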

Work with log files
To be effective in working with log files, you want to be fluent in one of the editors. If you cannot decide which editor you like, you can try vi, which is available on almost all flavors of UNIX.

Command        Purpose                                  Example                       Note
head -number   View the first n lines of the log file   head -10 logfile1203.log      -
tail -number   View the last n lines of the log file    tail -10 logfile1203.log      -
tail -f        Follow the log file as it grows          tail -f startServer.log       Continuously displays new entries in the log file
find           Find the log files                       find / -name "*.log" -print   -

Command (vi)   Purpose                                            Example             Note
/STRING        Search downward in the log file for STRING         /OutOfMemoryError   In vi, type "/" and then the string to search for
?STRING        Search upward in the log file for STRING           ?OutOfMemoryError   In vi, type "?" and then the string to search for
n              Repeat the last search from the present position   n                   In vi, type "n" to repeat the last search
Ctrl-g         Show the line number of the current line           Ctrl-g              -
G              Move the cursor to the end of the file             G                   -
Ctrl-b         One page up                                        Ctrl-b              -
Ctrl-f         One page down                                      Ctrl-f              -
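
Outside the editor, grep can do the same kind of search across a whole log file. The following is a minimal sketch; the function names count_errors and last_errors and the default pattern OutOfMemoryError are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: search a log file for an error string without opening an editor.

# count_errors LOGFILE [PATTERN] -> prints the number of lines containing PATTERN
count_errors() {
    grep -c -- "${2:-OutOfMemoryError}" "$1"
}

# last_errors LOGFILE [PATTERN] [N] -> prints the last N matching lines,
# each prefixed with its line number in the log file
last_errors() {
    grep -n -- "${2:-OutOfMemoryError}" "$1" | tail -n "${3:-5}"
}

# Example (log file name taken from the table above):
#   count_errors logfile1203.log
#   last_errors logfile1203.log OutOfMemoryError 3
```

The line numbers printed by last_errors can be used in vi (for example, typing 120G jumps to line 120).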



To provide more WebSphere-related examples, I will defer the discussion of shell scripting and running scripts as cron jobs until we have covered some WebSphere topics.

TCP/IP and Networking


WebSphere Application Server provides an execution environment in which a JEE application runs on the Internet, an intranet, or both. A typical deployment of WebSphere Application Server is integrated with a number of upstream and downstream systems in a networked environment. Hence the need for a WebSphere system engineer to learn networking.

As a typical setup, the web browser communicates with the WebSphere Application Server via a geographical load balancer such as 3DNS, which balances load between data centers. Then, the traffic is balanced by local load balancers such as F5 BIG-IP. The firewalls that form the demilitarized zones (DMZ) protect the WebSphere Application Server system along with the backend enterprise systems. The WebSphere Application Server is integrated with databases through database drivers over the network, as shown below.

User web browser->
------ Internet ------
Geographical load balancers ->
------- Firewall -------
Local load balancers ->
Web servers ->
------- Firewall -------
WebSphere Application Servers ->
A variety of enterprise systems



To work effectively in this networked WebSphere environment, you need to be able to clearly identify HTTP, HTTPS, virtual hosts, ports, SSL, certificates, firewalls, and load balancers, among others. Let's traverse each layer of this typical WebSphere architecture and go over the key concepts and skills as we go.

HTTP

I heard of HTTP for the first time in 1993 from a fellow graduate student. She talked about her thesis and told me about HTML documents. She was interested in the possibility of using HTML to create computer-based training on the server. I asked her how the students could get to the HTML documents on the server. She mentioned access to the server. Of course, it was not until around 1995 that HTTP gained popularity as a TCP-based transport, with the spread of the web browser.
As a WebSphere engineer, you want to be able to identify the following.
  • HTTP protocol
  • HTTP session
  • HTTP port
  • HTTP status code
  • What a WebSphere engineer does with HTTP
HTTP protocol

HTTP is a request-response network protocol that delivers files and other data from the server to the client. HTTP normally runs over a TCP socket connection to deliver the requested resources.

HTTP session 

An HTTP session is a delivery channel established between the server and the client. A sequence of requests and responses is carried over this channel before the session is destroyed. The HTTP session is needed to overcome the stateless nature of the request-response protocol, since it is important for the server to retain the state of the client and its requests.

HTTP port

The server listens on a particular port, usually port 80, for client requests. The client initiates a TCP connection to that port and sends a request. As soon as the server receives the request, it sends back a status line and a message whose body contains the requested resource.

HTTP status code

The following are the most frequently seen status codes.
200 OK - The request succeeded, and the requested resource is returned.
301 Moved Permanently - The resource has a new permanent URL.
302 Found (Moved Temporarily) - The resource is temporarily at another URL.
303 See Other - The response can be retrieved from another URL.
404 Not Found - The requested resource can't be found.
500 Internal Server Error - An unexpected server error has occurred.
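
The first digit of a status code tells you which party to look at first when troubleshooting. The sketch below maps a code to its class; the function name status_class is hypothetical, and the URL in the usage comment is only a placeholder for your own server.

```shell
#!/bin/sh
# Sketch: classify an HTTP status code by its first digit.

# status_class CODE -> prints the class of the status code;
# returns 1 if CODE is not a three-digit HTTP status code
status_class() {
    case $1 in
        1[0-9][0-9]) echo "informational" ;;
        2[0-9][0-9]) echo "success" ;;
        3[0-9][0-9]) echo "redirection" ;;
        4[0-9][0-9]) echo "client error" ;;
        5[0-9][0-9]) echo "server error" ;;
        *)           echo "unknown"; return 1 ;;
    esac
}

# Example with curl (the URL is a placeholder):
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:9080/)
#   status_class "$code"
```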

What to do with HTTP

A WebSphere system engineer frequently works on the following HTTP related tasks.

o   Configure the HTTP server, for example, Apache web server or IBM HTTP Server (IHS)
o   Design and implement the location and the routing of static content such as HTML documents
o   Configure the TCP/IP transport port of the WebSphere Application Server. For example, you can configure and route traffic directly to the TCP/IP transport port of the WebSphere Application Server
o   Design and implement HTTP sessions, for example, session persistence and session failover
o   Configure virtual hosts
o   Troubleshoot HTTP related problems, for example, pinging an HTTP port to see if the web server's or the application server's HTTP port is functional and responsive to requests
HTTPS (HyperText Transfer Protocol Secure)


I started to learn and use Internet security when I was developing Java applications for the now defunct Enron Corporation in Houston as an IBM technical consultant. Enron was trying to sell utilities such as gas and electricity over the Internet to residential and small-business customers. One of the technical challenges was to secure the contracts sent to the customer. First, such a document should be encrypted against interception. Second, the customer should not be able to modify the document. At first glance, the security task looked intimidating. However, after some time working on Internet and Java security, I successfully used HTTPS to encrypt the transmission and bought a software component from an Australian company to ensure that the customer could not use a PDF editor to modify the contract sent to him or her over the Internet.

HTTPS is a secure transport

HTTPS is the secure HTTP transport: the data transmitted over the Internet is encrypted. HTTPS uses keys to encrypt and decrypt the data transmitted over the network. For encryption, HTTPS depends on Secure Sockets Layer (SSL), whose successor is now officially named Transport Layer Security (TLS).
 
TLS or SSL

To discuss TLS or SSL, you first should learn asymmetrical cryptography and symmetrical cryptography. The difference between asymmetrical cryptography and symmetrical cryptography is the key used to encrypt and decrypt the data sent over the network.

For symmetrical cryptography, the key used to encrypt and decrypt the data sent over the network is the same. For example, you use the same key to decrypt a text that was encrypted with that key. It is difficult to use symmetrical cryptography alone on the Internet, especially because of the initial key sharing. Asymmetrical cryptography overcomes this limitation with a public key and private key architecture.

The TLS or SSL handshake between the server and the client can appear quite involved. There are a number of components: the server's public key, the server's security certificate, ciphers, hash functions, and so on. Do not be intimidated. All you need is patience to work through the process. There is an abundance of simple technical explanations to help you learn, for example, the TLS article on Wikipedia.

The key to TLS is the server's sharing of its public key. The client uses the public key to encrypt the initial exchange that establishes a session key. The server uses its private key to decrypt the message sent by the client. From then on, the session key works as the symmetrical key for secure communication.

Please have a look at the following steps (digital certificates are discussed in the next section).

·       The handshake begins when a client connects to a TLS-enabled server requesting a secure connection and presents a list of supported cipher suites (ciphers and hash functions).
·       From this list, the server picks the strongest cipher and hash function that it also supports and notifies the client of the decision.
·       The server sends back its identification in the form of a digital certificate. The certificate usually contains the server name, the trusted certificate authority (CA) and the server's public encryption key.
·       The client may contact the server that issued the certificate (the trusted CA as above) and confirm the validity of the certificate before proceeding.
·       In order to generate the session keys used for the secure connection, the client encrypts a random number with the server's public key and sends the result to the server. Only the server should be able to decrypt it, with its private key.
·       From the random number, both parties generate key material for encryption and decryption.
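
The public key step in the handshake above can be tried out locally with the openssl command line tool. This is only a sketch of the idea, not a real TLS handshake: a throwaway RSA key pair plays the server, a random string plays the client's secret, and the function name demo_key_exchange and all file names are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: encrypt a random secret with a public key and recover it with the private key.

# demo_key_exchange -> exit status 0 if the decrypted secret matches the original
demo_key_exchange() {
    dir=$(mktemp -d) || return 1

    # "Server": a private key, and the public key that its certificate would carry
    openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 \
        -out "$dir/server.key" 2>/dev/null || return 1
    openssl pkey -in "$dir/server.key" -pubout -out "$dir/server.pub" || return 1

    # "Client": generate a random secret and encrypt it with the server's public key
    openssl rand -hex 24 > "$dir/secret.txt" || return 1
    openssl pkeyutl -encrypt -pubin -inkey "$dir/server.pub" \
        -in "$dir/secret.txt" -out "$dir/secret.enc" || return 1

    # "Server": only the holder of the private key can decrypt the secret
    openssl pkeyutl -decrypt -inkey "$dir/server.key" \
        -in "$dir/secret.enc" -out "$dir/secret.dec" || return 1

    cmp -s "$dir/secret.txt" "$dir/secret.dec"
    rc=$?
    rm -rf "$dir"
    return $rc
}

# Example:
#   demo_key_exchange && echo "both sides now share the same secret"
```

Once both sides hold the same secret, they can derive the symmetrical session key from it, which is why the expensive public key operation is needed only once per session.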

Digital Certificate

Your driver's license can function as a certificate of your identity. Digital certificate, public key certificate, and identity certificate all refer to an electronic document that serves as proof of the identity of a web site. Your driver's license comes from your local government. A digital certificate for a web site is usually issued by a Certificate Authority (CA) such as VeriSign. There are various classes of digital certificates with different levels of protection. For example, the digital certificate used by Bank of America online banking is a 128-bit certificate that provides strong encryption protection. Depending on the browser that you use, you can see “Bank of America” on a green background on either the right or the left side of the address bar. Click on it and you will see the details of the 128-bit digital certificate. Examine the certificate, compare it with the certificates of other large banks, and note any differences.

HTTPS port

Often, an IP address uniquely identifies a computer and a port number a unique process on that computer. Therefore, an IP address and a port number constitute a unique end point in network communication.

Here is a list of well-known ports. I was asked about these ports at a job interview. At least, remember that HTTP uses port 80 and HTTPS uses port 443. For a comprehensive listing, check out this Wikipedia article

·       20 & 21: File Transfer Protocol (FTP)
·       22: Secure Shell (SSH)
·       23: Telnet remote login service
·       25: Simple Mail Transfer Protocol (SMTP)
·       53: Domain Name System (DNS) service
·       80: Hypertext Transfer Protocol (HTTP) used in the World Wide Web
·       110: Post Office Protocol (POP3)
·       443: HTTP Secure (HTTPS)

What a WebSphere engineer does with HTTPS

A WebSphere engineer installs and configures the digital certificate, troubleshoots certificate and HTTPS related issues, and ensures that the digital certificate is renewed before it expires. All of this is important work. For example, if your certificate expires before you renew it, your production servers may be impacted. However, WebSphere infrastructure design is the main area where a WebSphere engineer needs his or her HTTPS knowledge and skills. The following are some of the topics during the design phase of a project.

·      Consulting on whether to use HTTPS – HTTPS secures network communication at a performance cost, due to the resources needed for encryption and decryption
·      Where to terminate HTTPS – within the intranet of a corporation, there may be no need for HTTPS and its extra cost. Therefore, the HTTPS traffic coming from the Internet may be terminated at the local load balancer layer
·      Decisions on offloading the HTTPS encryption and decryption work from the server to special equipment or appliances, such as a dedicated encryption card or an IBM DataPower appliance, due to the CPU-intensive nature of encryption
·      Helping the team to identify the certificate and the Certificate Authority to use
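
Watching for certificate expiry, mentioned at the start of this section, is easy to automate. The sketch below assumes that the certificate has been exported to a PEM file; the function name cert_expires_within and the file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: warn when a PEM certificate is about to expire.

# cert_expires_within CERT_PEM DAYS
#   exit status 0 -> the certificate expires within DAYS days, is already expired,
#                    or cannot be parsed, so it needs attention
#   exit status 1 -> the certificate is still valid after DAYS days
#   exit status 2 -> the file cannot be read
cert_expires_within() {
    [ -r "$1" ] || return 2
    seconds=$(( $2 * 86400 ))
    # "openssl x509 -checkend N" exits 0 when the certificate is still valid N seconds from now
    if openssl x509 -in "$1" -noout -checkend "$seconds" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Example (hypothetical file name), suitable for a daily cron job:
#   cert_expires_within /tmp/mysite.pem 30 && echo "Renew the certificate within 30 days"
#   openssl x509 -in /tmp/mysite.pem -noout -enddate    # prints the expiry date
```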


Geographical load balancer


A geographical load balancer is also called a global load balancer. It is a specialized computer or network device that distributes traffic among data centers. Companies that have multiple production data centers usually have N+1 redundancy to achieve high availability and resiliency for critical enterprise applications. Geographical load balancers are deployed at each data center to distribute the load among the clusters of redundant servers. To better understand how a geographical load balancer works, we have to look at the Domain Name System, or DNS.

DNS


Many years back, I got an interesting email. The message was about a beautiful, young, female PhD candidate, a total stranger, reaching out to me for friendship. There was a phone number in the message and I was encouraged to call. I did call and the young lady, a young PhD candidate, did answer my call and told me that I was one among many who called. Someone sent out hundreds of emails in her name and gave out her phone number.

I had a close look at the email header and found the IP address of the computer that the email came from. I looked up the domain name for that IP address and found that it belonged to a university. I called the university and asked for help locating the computer. The university IT staff found that the computer belonged to one of its computer labs. Searching through the lab's logbook, they identified a male student who was using the computer when the messages were sent. The authorities were notified. The FBI came and took the keyboard, mouse, and computer as evidence.

Yes, you can get to the domain name from the IP address. The opposite is true as well: more often, you get to the IP address from the domain name. Either way, the Domain Name System is involved. DNS is primarily a hierarchical distributed naming system for computers. Have a look at the following.

1.     A web browser contacts a DNS server with a domain name to get its IP address
2.     If the DNS server does not have this name, it contacts its “parent” DNS server
3.     The “parent” DNS server finds the domain name and its IP address
4.     The IP address is returned to the “child” DNS server
5.     The IP address is relayed to the browser by the “child” DNS server
6.     The browser uses the returned IP address to contact the host computer
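
You can ask the system resolver the same question the browser asks. The sketch below uses getent, which is available on Ubuntu and most Linux systems and follows the host's own lookup order (local hosts file first, then DNS); the function name resolve_name is hypothetical.

```shell
#!/bin/sh
# Sketch: look up the address of a host name through the system resolver.

# resolve_name NAME -> prints the address(es) that the resolver returns for NAME
resolve_name() {
    getent hosts "$1" | awk '{ print $1 }'
}

# Example:
#   resolve_name localhost
# The reverse direction works with the same tool. Given an IP address,
# getent prints the host name registered for it (192.0.2.10 is a placeholder):
#   getent hosts 192.0.2.10
```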

How Geographical Load Balancer works



A geographical load balancer such as F5 Global Traffic Manager (GTM) is deployed at each data center that is being load balanced. For example, if you have three data centers, you will have three geographical load balancers installed, one in each data center. However, only one geographical load balancer plays the primary DNS role. When browser traffic comes in, it first goes to the geographical load balancer that serves as the primary DNS. Then, the following usually happens.


1.     The primary DNS sends a query to each geographical load balancer
2.     The responses are collected and a “best fit” candidate is determined
3.     The primary DNS returns the IP address of the “best fit” device
4.     The browser uses the IP address returned to contact the device at the right data center
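
The “best fit” decision in steps 2 and 3 can be pictured with a toy example. The sketch below is an illustration only: it assumes that each data center reports a single response time and that the lowest one wins, whereas a real geographical load balancer weighs several metrics. The function name best_fit, the data center names, and the addresses are all made up.

```shell
#!/bin/sh
# Toy sketch: pick the data center with the lowest reported response time.

# best_fit -> reads lines of "name ip response_ms" on stdin and
# prints the IP address of the entry with the smallest response_ms
best_fit() {
    awk 'NF >= 3 && (best == "" || $3 + 0 < min) { best = $2; min = $3 + 0 }
         END { if (best != "") print best }'
}

# Example with made-up responses from three data centers:
#   printf '%s\n' 'dc1 192.0.2.10 40' 'dc2 198.51.100.10 12' 'dc3 203.0.113.10 25' | best_fit
```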

What a WebSphere engineer does with a geographical load balancer


Very few IT organizations have their middleware team manage geographical load balancers. However, a WebSphere engineer is better equipped with a high-level knowledge of geographical load balancers, especially when designing middleware infrastructure and isolating an infrastructure problem.

To be continued -

For further reference, I have one chapter in my WebSphere engineering book that is dedicated to technical training, hiring considerations, and the technical skills needed to be a good WebSphere system administrator.

5 comments:

Gopinathan Munappy said...

Dear Ying Ding
Just a curiosity, Did you publish How to Learn WebSphere in 31 Days-Part5?

Ying Ding said...

Not yet, but we intend to finish all parts. This is a after-work project and we appreciate your patience.

Anonymous said...

Dear Ying Ding,

Thanks for the great article regarding WAS, really it is very informative and useful. And we are egerly waiting for Part-5.

Anonymous said...

Hi Mr. Ding,

I'm currently employed at IBM supporting WebSphere Middleware and while digging through your background, I didn't know that you were an IBM consultant for many years. I really enjoyed reading this blog and digesting all the information you write about WebSphere. I look forward to Part 5 - Jython and WebSphere Automation. Don't keep us eager, avid reader waiting too long.

Anonymous said...

Hi Ying,

I'm also an employee at IBM working at one of the Delivery Centers in NA as a Middleware tech. I truly enjoyed reading all your articles. Keep your knowledge transfer coming.