4. Running GemStone

This chapter shows you how to perform some common GemStone/S 64 Bit system operations:

Starting the GemStone Server

Starting a NetLDI

Listing Running Servers

Starting a GemStone Session

Identifying Logged-in Sessions

Shutting Down the Object Server and NetLDI

Recovering from an Unexpected Shutdown

Bulk-Loading Objects

Managing Large Repositories

4.1  Starting the GemStone Server

In order to start a Stone repository monitor, the following must be identified through your operating system environment:

The GEMSTONE environment variable must point to the directory where GemStone is installed, such as /users/gemstone. The directory $GEMSTONE/bin should be in your search path for commands.

The repository monitor must find a configuration file. The default is $GEMSTONE/data/system.conf. Other files can supplement or replace the default file; for information, see How GemStone Uses Configuration Files.

The configuration file must give the path to one or more repository files (extents) and to space for transaction logs. The default configuration file specifies $GEMSTONE/data/extent0.dbf for the extent file, and places transaction logs in $GEMSTONE/data/. You may want to move these files to other locations. For further information, see Choosing the Extent Location.
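The three prerequisites above can be checked before you attempt to start a Stone. The following is a minimal sketch, not part of the GemStone distribution: the function name gs_preflight is hypothetical, and it checks only for the default configuration file $GEMSTONE/data/system.conf, so a site that supplies its configuration file by another route would need to adjust the last test.

```shell
# Hypothetical helper (not shipped with GemStone): verify the basic
# environment prerequisites for starting a Stone.
# $1 is the value you intend to assign to GEMSTONE.
gs_preflight() {
    gs_dir=$1

    # GEMSTONE must be a full pathname, starting with a slash.
    case $gs_dir in
        /*) ;;
        *)  echo "GEMSTONE must be a full pathname: '$gs_dir'" >&2
            return 1 ;;
    esac

    # The installation directory and its bin subdirectory must exist.
    [ -d "$gs_dir" ]     || { echo "no such directory: $gs_dir" >&2; return 1; }
    [ -d "$gs_dir/bin" ] || { echo "missing directory: $gs_dir/bin" >&2; return 1; }

    # Assumption: the default configuration file is in use.
    [ -r "$gs_dir/data/system.conf" ] || {
        echo "cannot read default config file: $gs_dir/data/system.conf" >&2
        return 1
    }

    echo "ok"
}
```

A typical call would be gs_preflight "$GEMSTONE"; a nonzero return status indicates which prerequisite is not met.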

To Start GemStone

Follow these steps to start GemStone following installation or an orderly shutdown. (To recover from an abnormal shutdown, refer to Recovering from an Unexpected Shutdown.)

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. Ordinarily this directory has a name like GemStone64Bit3.2.0-x86_64.Linux (depending on your platform). For example:

$ GEMSTONE=/users/GemStone64Bit3.2.0-x86_64.Linux
$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or unset previous settings of these environment variables:

  • GEMSTONE
  • GEMSTONE_SYS_CONF
  • GEMSTONE_EXE_CONF
  • GEMSTONE_NRS_ALL

Step 2. Set your UNIX path. One way to do this is to use one of the gemsetup scripts. There is one version for users of the Bourne and Korn shells and another for users of the C shell. These scripts also set your man page path to include the GemStone man pages. Note that these scripts append to the end of your path or man path; you will need to manually remove references to older versions of GemStone.

(Bourne or Korn shell)
$ . $GEMSTONE/bin/gemsetup.sh

or (C shell)
% source $GEMSTONE/bin/gemsetup.csh

Step 3. Start GemStone by using the startstone command:

% startstone [gemStoneName]

where gemStoneName is optional and is the name you want the repository monitor to have. The default name is gs64stone. See startstone for additional information.
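Step 2 notes that the gemsetup scripts append to your path, so entries for older GemStone versions must be removed by hand. The sketch below shows one way to do this for a PATH-style string. The function name strip_old_gemstone_paths is hypothetical, and it assumes that older installations can be recognized by the substring GemStone64Bit in the directory name, following the naming convention shown in Step 1; installations under differently named directories would not be detected.

```shell
# Hypothetical helper: drop PATH entries that belong to GemStone
# installations other than the current one.
# $1 is a colon-separated path string; $2 is the current $GEMSTONE.
# The body runs in a subshell so that IFS and "set -f" do not leak out.
strip_old_gemstone_paths() (
    IFS=:
    set -f                      # no filename expansion while splitting
    result=
    for entry in $1; do
        case $entry in
            "$2"|"$2"/*)     keep=1 ;;   # current installation: keep
            *GemStone64Bit*) keep=0 ;;   # assumed older installation: drop
            *)               keep=1 ;;   # unrelated entry: keep
        esac
        if [ "$keep" -eq 1 ]; then
            result=${result:+$result:}$entry
        fi
    done
    printf '%s\n' "$result"
)
```

For example, PATH=$(strip_old_gemstone_paths "$PATH" "$GEMSTONE") rewrites the search path in a Bourne or Korn shell. Review the result before exporting it, because the substring match is only a heuristic.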

To Troubleshoot Stone Startup Failures

If the Stone repository monitor fails to start in response to a startstone command, it’s likely that the cause is one of the following. Inspect the Stone log for clues (the default location is $GEMSTONE/data/gs64stone.log).

  • The GemStone key file is missing or invalid.
  • The shared page cache cannot be attached.
  • A problem with the extents: a missing extent or one that is in use by another process
  • A problem with transaction logs: a log needed for recovery is missing, or the log directory or device does not exist
  • The repository has become corrupted

Missing or Invalid Key File

The Stone repository monitor must be able to read the GemStone key file. By default, this file is $GEMSTONE/sys/gemstone.key, but the location and filename can be changed with the KEYFILE configuration parameter.

Ordinarily, you create this file during installation from information provided by GemStone. Be careful to enter the information correctly. GemStone key files are platform-specific, and key files for earlier versions may not work with new releases.

If you do not have a valid key file, contact GemStone Technical Support as described under Technical Support.

Shared Page Cache Cannot Be Attached

The shared page cache monitor must be able to create and attach to the shared memory segment that will serve as the shared page cache. Several factors may prevent this from happening:

  • On some platforms, shared memory is not enabled in the kernel by default, or its default maximum size is too small to accommodate the GemStone configuration. GemStone’s default configuration requires a shared memory segment somewhat larger than 75 MB. For specifics about configuring shared memory, refer to the GemStone/S 64 Bit Installation Guide for your platform.
  • If the size of the shared page cache has been increased, the operating system’s limit on shared memory regions may need to be increased accordingly. GemStone includes a utility, $GEMSTONE/install/shmem, that will help you check the configuration; this is described here.
  • The repository executables (the Stone, Gems, and page servers) must have permission to read and write the shared page cache. Ways to set up access are described in To Set File Permissions for the Server. In general, users must belong to the same group as the Stone repository monitor. If the Stone is running as root, it is unlikely that other users will be able to access the shared page cache.

Extent Missing or Access Denied

If the Stone repository monitor cannot access a repository extent file, it logs a message like the following:

GemStone is unable to open the file
!TCP@pelican#dbf!/users/GemStone/data/extent0.dbf.
reason = File = /users/GemStone/data/extent0.dbf
DBF Op: Open; DBF Record: -1;
Error: open() failure; System Codes: errno=2, ENOENT, The file or
directory specified cannot be found
 
An error occurred opening the repository for exclusive access.
 
Stone startup has failed.

Examine the message for further clues. The extent file could be missing, the permissions on the file or directory could be set incorrectly, or there may be an error in the configuration file that points to the extents. Correct the problem, then try starting GemStone again.

Extent Open by Another Process

If another process has an extent file open when you attempt to restart GemStone, a message like the following appears in the Stone log (by default, $GEMSTONE/data/gs64stone.log):

GemStone is unable to open the file
!TCP@pelican#dbf!/users/GemStone/data/extent0.dbf.
reason = File = /users/GemStone/data/extent0.dbf
DBF Op: Open; DBF Record: -1; Error: exclusive open:  File is open
by another process.; System Codes:
errno=11, EAGAIN, No more processes (due to process table full, 
user quotas, or insufficient memory)
 
An error occurred opening the repository for exclusive access.
 
Stone startup has failed.

Close any other Gem sessions (including Topaz sessions) that are accessing the repository you are trying to restart, or wait for a copydbf to complete. Use ps -ef (the options on your system may differ) to identify any pgsvrmain processes that are still running, and then use kill processid to terminate them. Try again to start GemStone.
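As a sketch of the ps step above, the following filter prints the process id of each pgsvrmain line it is given. The function name pgsvrmain_pids is hypothetical, and the sketch assumes ps -ef style output in which the process id is the second column; check the column layout on your platform before relying on it. The function only lists process ids and leaves the decision to run kill to you.

```shell
# Hypothetical helper: read "ps -ef" style output on standard input and
# print the PID (assumed to be column 2) of every pgsvrmain process.
# The [p] in the pattern keeps this awk command itself from matching
# when it is used in a pipeline after ps.
pgsvrmain_pids() {
    awk '/[p]gsvrmain/ { print $2 }'
}

# Usage (not executed here):
#   ps -ef | pgsvrmain_pids
```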

Extent Already Exists

If GemStone attempts to recover from a system crash that occurred just after an extent was created, and GemStone was not able to write a checkpoint when the extent was added, you will find an error message like the following in the Stone log:

An error occurred in recovery for extentId 2:
fileName= !TCP@pelican#dbf!/users/GemStone/data/extent1.dbf
File already exists; you must delete it before recovery can succeed.

Check that an extent was being added to the repository at or shortly before the crash. If necessary, look for a message near the end of the Stone log file.

  • If an extent was being added, there is no committed data in the extent file yet. Delete the specified file and do not replace it with anything. Try to start GemStone again. The recovery procedure will recreate the extent file.
  • If an extent was NOT being added, it is possible that an existing extent has been corrupted. For instance, extent0.dbf of a multiple-extent repository may have been overwritten. Try to determine the cause and whether the action can be rectified. You may have to restore the repository from a backup.

Other Extent Failures

At startup, the GemStone system performs consistency checks on each extent listed in DBF_EXTENT_NAMES.

All extents must have been shut down cleanly with a repository checkpoint the last time the system was run. This consistency check is the only one for which GemStone attempts automatic recovery.

The following consistency checks, if failed, cause the startup sequence to terminate. These failures imply corruption of the disk or file system, or that the extents were modified at the operating system level (such as by cp or copydbf) outside of GemStone’s control and in a manner that has corrupted the repository.

  • Extents must be in proper sequence within DBF_EXTENT_NAMES.
  • Extents must be properly sequenced in time.
  • The last checkpoint must have occurred earlier than or at the same time as the current system time (in GMT).
  • Extents must belong to the correct repository.

Transaction Log Missing

If GemStone cannot find the transaction log file for the period between the last checkpoint and an unexpected shutdown, it puts a message like this in the Stone log:

Extent 0 was not cleanly shutdown; recovery is needed.
<Repository startup statistics>
 
Repository startup is from checkpoint = (fileId 6, blockId 3)
 
ERROR: cannot find log file(s) to recover repository.
To proceed without tranlogs and lose transactions committed
since the last checkpoint use "-N" switch on your startstone
command.
 
An error occurred when attempting to start repository recovery.
Waiting for aiowrites to complete
 
Stone startup has failed.

If the log file was archived and removed from the log directory, restore the file.

If the log file is no longer available, you can use startstone -N to restart from the most recent checkpoint in the repository. However, any transactions that occurred during the intervening period cannot be recovered.

NOTE
When you use startstone with the -N option, any transactions occurring after the last checkpoint are permanently lost.

Other Startup Failures

  • Check /opt/gemstone/locks (or equivalent location, as discussed here) and remove old files. On Solaris systems, also check /tmp/gemstone for stoneName..FIFO.
  • Certain unexpected shutdowns may leave UNIX interprocess communication facilities allocated, which can block attempts to restart the repository monitor. Use the command ipcs to identify the shared memory segments and semaphores allocated, then use ipcrm to free those resources allocated to a repository monitor that is no longer running. For information about ipcs and ipcrm, consult your operating system’s documentation.
  • If it takes more than 5 minutes for your cache to complete initialization, the startup timeout may be expiring. Set the environment variable $GEMSTONE_SPCMON_STARTUP_TIMELIMIT.
  • Check your installation configuration and make sure that all required files and libraries are present and uncorrupted.
  • Try to run pageaudit on the repository. (See Repository Page and Object Audit.)

If you are still unable to start GemStone or determine the reason that startup is failing, contact your local GemStone administrator or GemStone Technical Support.

If this is an existing GemStone repository and the problems reported on startup attempts indicate that the repository is corrupt, you may need to restore from backups, as described in Chapter 9. See “How to Restore from Backup.”
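The Stone log messages quoted in this section can be used for a first, coarse classification of a startup failure. The following is a sketch only: the function name classify_startup_failure and the category labels are hypothetical, and the search strings are taken from the sample messages above, which may differ in wording between releases. A result of "unknown" means only that none of these strings was found.

```shell
# Hypothetical helper: search a Stone log for the startup failure
# messages quoted in this section and print a coarse category.
# $1 is the path to the Stone log, for example $GEMSTONE/data/gs64stone.log.
classify_startup_failure() {
    log=$1
    if grep -Fq 'exclusive open' "$log"; then
        echo extent-in-use            # "Extent Open by Another Process"
    elif grep -Fq 'ENOENT' "$log"; then
        echo extent-missing           # "Extent Missing or Access Denied"
    elif grep -Fq 'File already exists; you must delete it' "$log"; then
        echo extent-already-exists    # "Extent Already Exists"
    elif grep -Fq 'cannot find log file(s) to recover repository' "$log"; then
        echo tranlog-missing          # "Transaction Log Missing"
    else
        echo unknown
    fi
}
```

Always read the surrounding log text as well; for instance, ENOENT indicates only that some file was not found, and the message names the file.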

4.2  Starting a NetLDI

You will usually need to start a GemStone Network Long Distance Information (NetLDI) server when starting a Stone repository monitor. NetLDI servers are needed to start up Gem processes for RPC logins, and for starting up caches on behalf of Gems that are on other nodes.

If you are running distributed configurations, you will need to perform these steps on each node that requires a NetLDI.

To start a NetLDI server, perform the following steps on the node where the NetLDI is to run:

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. Ordinarily this directory has a name like GemStone64Bit3.2.0-x86_64.Linux (depending on the platform). For example:

$ GEMSTONE=/installDir/GemStone64Bit3.2.0-x86_64.Linux
$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or unset previous settings of the $GEMSTONE_NRS_ALL environment variable.

Step 2. Use one of the gemsetup scripts to set your UNIX path. There is one version for users of the Bourne and Korn shells and another for the C shell. These scripts also set your man page path to include the GemStone man pages.

(Bourne or Korn shell)
$ . $GEMSTONE/bin/gemsetup.sh

or (C shell)
% source $GEMSTONE/bin/gemsetup.csh

Step 3. Start the NetLDI by using the startnetldi command.

% startnetldi
% startnetldi -g -aname

For additional information about startnetldi, see the command description in Appendix B. For information about the authentication modes, see How To Arrange Network Security.

To Troubleshoot NetLDI Startup Failures

If the NetLDI service fails to start in response to a startnetldi command, it’s likely that the cause is one of the following:

  • The NetLDI is to run as root but the guest mode option is specified. This combination is not allowed.
  • The account starting the NetLDI does not have permission to create or append to its log file.
  • The account starting the NetLDI does not have read and execute permission for $GEMSTONE/sys/netldid.

Check the NetLDI log for clues. By default, the NetLDI log (netLdiName.log) is located in /opt/gemstone/log. On some systems, this file may be located in /usr/gemstone/log. The location may be overridden using the -l option to the startnetldi command, or by setting $GEMSTONE_GLOBAL_DIR.

4.3  Listing Running Servers

The gslist utility lists all Stone repository monitors, shared page cache monitors, and NetLDIs that are running. The gslist command by itself checks the locks directory (/opt/gemstone/locks, /usr/gemstone/locks, or $GEMSTONE_GLOBAL_DIR/locks) for entries. The -v option causes it to verify that each process is alive and responding. For example:

% gslist -v
Status Version Owner     Started      Type   Name
------ ------- --------- ------------ ------ ----
 OK   3.2.0    gsadmin   Mar 11 12:02 cache  gs64stone~1c9fa07f0412665
 OK   3.2.0    gsadmin   Mar 11 12:02 Stone  gs64stone
 OK   3.2.0    gsadmin   Mar 11 10:13 Netldi gs64ldi
 

By default, gslist lists servers on the local node. The -m host option performs the operation on node host, which must have a compatible NetLDI running.
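In scripts, it can be convenient to extract the status of one server from the gslist listing. The sketch below assumes the column layout shown in the example above, in which the status is the first field and the server name is the last field. The function name gslist_status is hypothetical, and the layout may vary between releases.

```shell
# Hypothetical helper: read gslist output on standard input and print the
# Status column for the server whose Name (last column) equals $1.
# Returns nonzero if no line for that name is found.
gslist_status() {
    awk -v name="$1" '
        $NF == name { print $1; found = 1 }
        END         { exit (found ? 0 : 1) }
    '
}

# Usage (not executed here):
#   gslist -v | gslist_status gs64stone
```

With the sample listing above, the function prints OK for gs64stone and returns a nonzero status for a name that is not listed.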

4.4  Starting a GemStone Session

This section tells how to start a GemStone session and log in to the repository monitor. The instructions apply to all logins from the node on which the Stone repository monitor is running.

This section begins with a brief discussion of environment variables, and then presents two examples. The first example starts a linked application and logs in to GemStone. The second example starts an RPC application, which in turn spawns a separate Gem session process that communicates with the GemStone server.

The examples use Topaz as the application because it is part of the standard GemStone Object Server distribution. Other applications may use different steps to accomplish the same purpose. Some users may prefer to make these steps part of an initialization file.

For an explanation of the difference between linked and RPC sessions, see Linked and RPC Applications.

To Define a GemStone Session Environment

In order to start a GemStone session, the following must be defined through your operating system environment:

  • Where GemStone is installed

All GemStone users must have a GEMSTONE environment variable that points to the GemStone installation directory, such as
/installDir/GemStone64Bit3.2.0-x86_64.Linux (depending on your platform). The directory $GEMSTONE/bin should be in your search path for commands. For an example, see the next topic, “To Start a Linked Session”.

  • Which configuration parameters to use

Because each GemStone session can have its own configuration file, some users may need a second environment variable, such as GEMSTONE_EXE_CONF. If no other file is found, the session uses system defaults. For further information, see How GemStone Uses Configuration Files.

To Start a Linked Session

The following steps show how to start a linked application (here, the linked version of Topaz). The steps for setting the GEMSTONE environment variable and the operating system path for a session are the same as those given for starting a repository monitor under Starting the GemStone Server. They are repeated here for convenience.

The procedure assumes that the Stone repository monitor has already been started and has the default name gs64stone.

Step 1. Set the GEMSTONE environment variable to the full pathname (starting with a slash) of the directory where GemStone is installed. Ordinarily this directory has a name like GemStone64Bit3.2.0-x86_64.Linux (depending on your platform). For example:

$ GEMSTONE=/installDir/GemStone64Bit3.2.0-x86_64.Linux
$ export GEMSTONE

If you have been using another version of GemStone, be sure you update or delete previous settings of these environment variables:

  • GEMSTONE
  • GEMSTONE_SYS_CONF
  • GEMSTONE_EXE_CONF
  • GEMSTONE_NRS_ALL

Step 2. Set your UNIX path. One way to do this is to use one of the gemsetup scripts. There is one version for users of the Bourne and Korn shells and another for users of the C shell. These scripts also set your man page path to include the GemStone man pages.

(Bourne or Korn shell)
$ . $GEMSTONE/bin/gemsetup.sh

or (C shell)
% source $GEMSTONE/bin/gemsetup.csh

Step 3. Start linked Topaz:

% topaz -l

Step 4. Set the UserName login parameter:

topaz> set username DataCurator

Step 5. Log in to the Gem session. It will query you for the password.

topaz> login
GemStone Password?
[Info]: LNK client/gem GCI levels = 860/860
[Info]: libssl-3.2.0-64.so: loaded
[Info]: User ID: DataCurator
[Info]: Repository: gs64stone
[Info]: Session ID: 6
[Info]: GCI Client Host: <Linked>
[Info]: Page server PID: -1
[Info]: Login Time: 03/14/2014 11:36:47.508 PDT
Gave this VM preference for OOM killer, Wrote to /proc/6923/oom_adj value 4
[03/14/2014 11:36:47.510 PDT]
  gci login: currSession 1 linked session
successful login
topaz 1> 

At this point, you are logged in to a Gem session process, which is linked with the application. The session process acts as a server to Topaz and as a client to the Stone. Information about Topaz is in the manual GemStone Topaz Programming Environment.

When you are ready to end the GemStone session, you can log out of GemStone and exit Topaz in one step by invoking the Topaz exit command:

topaz 1> exit

To Start an RPC Session

The following steps show how to start an RPC application (here, the RPC version of Topaz) on the server node. The procedure assumes that the Stone is running under the default name gs64stone and that you are already set up to run a GemStone session as described in Step 1 and Step 2 of the previous example (“To Start a Linked Session”).

Sessions that log in using RPC use SRP (Secure Remote Password) and SSL to authenticate passwords for login. If the Gem is running on the server node, the connection reverts to normal socket communication after login completes.

The following steps demonstrate an RPC login from topaz:

Step 1. Use gslist to find out if a NetLDI is already running. The default name for the NetLDI is gs64ldi.

% gslist
Status Version Owner     Started      Type   Name
------ ------- --------- ------------ ------ ----
exists 3.2.0   gsadmin   Mar 11 12:02 cache  gs64stone~1c9fa07f0412665
exists 3.2.0   gsadmin   Mar 11 12:02 Stone  gs64stone
exists 3.2.0   gsadmin   Mar 11 10:13 Netldi gs64ldi
 

If necessary, start a NetLDI following the instructions under Starting a NetLDI.

Step 2. Unless the NetLDI is running in guest mode with a captive account, set the application login parameters, such as HostUserName and HostPassword, after you start the application. For example:

topaz> set hostusername yourUnixId
topaz> set hostpassword yourPassword

Step 3. Start the RPC application (such as Topaz), then set the UserName.

topaz> set username DataCurator

Step 4. Set GemNetId (the name of the Gem service to be started) to gemnetobject. This script starts the separate Gem session process for you. For example:

topaz> set gemnetid gemnetobject

Step 5. Log in to the GemStone session.

topaz> login
GemStone Password?
[Info]: libssl-3.2.0-64.so: loaded
[03/14/2014 11:36:47.777 PDT]
  gci login: currSession 1 rpc gem processId 6943
successful login
topaz 1> 

At this point, you are logged in through a separate Gem session process that acts as a server to Topaz RPC and as a client to the Stone repository monitor.

When you are ready to end the GemStone session, you can log out of GemStone and exit Topaz in one step by invoking the Topaz exit command:

topaz 1> exit

To Troubleshoot Session Login Failures

Several factors may prevent successful login to the repository:

  • Your GemStone key file may establish a maximum number of user sessions that can simultaneously be logged in to GemStone. (Note that a single user may have multiple GemStone sessions running simultaneously.) The limit itself is encoded in the keyfile used to start the stone (by default, $GEMSTONE/sys/gemstone.key), and reported in the stone log on startup. By default, the Stone log file is $GEMSTONE/data/gemStoneName.log. Look for a line like this:
SESSION MAX: The licensed concurrent session max is 10.
  • The STN_MAX_SESSIONS configuration option can restrict the number of logins to fewer than a particular key file allows. An entry in the Stone log file shows the maximum at the time the Stone started. By default, the Stone log file is $GEMSTONE/data/gemStoneName.log. Look for a line like this:
SESSION CONFIGURATION: The maximum number of concurrent sessions is 41
  • The SHR_PAGE_CACHE_NUM_PROCS configuration option restricts the number of sessions that can attach to a particular shared page cache. This number can be different on each node, depending on the configuration file that is read by the process that starts the cache. On the node where the Stone runs, each of the following uses one of these slots: the Stone, the shared page cache monitor, each GcGem (garbage collection) session, each Stone AIO page server, the page manager, the SymbolGem, and each free frame page server. On other nodes, the Stone’s page server and the shared page cache monitor each use one. For details, see To Set the Page Cache Options and the Number of Sessions. Check the Stone’s log for warnings that the value requested for SHR_PAGE_CACHE_NUM_PROCS has been adjusted to match your system’s configuration.
  • The UNIX kernel must provide two semaphores for each session that wants to attach to the shared page cache. See Reviewing Kernel Tunable Parameters.
  • The UNIX kernel file descriptor limit can restrict the number of sessions, and GemStone executables attempt to raise that limit. For information, see the discussions under Estimating Server File Descriptor Needs and Estimating Session File Descriptor Needs. On some operating systems, you can examine the kernel limit by invoking limit.
  • The owner of the Gem or a linked application process must have write access to the extent file and to the shared page cache. Use the UNIX command ipcs -m to display permissions, owner, and group for shared memory. For example:
server% ipcs -m
IPC status from <running system> as of Mon March 10 16:21:08 PDT 2014
T      ID      KEY         MODE         OWNER      GROUP
Shared Memory:
m      25089   0x4c000ed5  --rw-rw----  gsadmin    users

Typical problems occur with linked applications, which may be installed without the S bit and therefore rely on group access to the shared page cache and the repository.
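When you are diagnosing a session limit, the two Stone log lines quoted above (SESSION MAX and SESSION CONFIGURATION) can be extracted in one step. The following is a sketch under the assumption that the log wording matches those samples exactly. The function name session_limits is hypothetical. If a line occurs more than once in the log, the last occurrence is reported.

```shell
# Hypothetical helper: print the licensed and configured session limits
# recorded in a Stone log. $1 is the path to the Stone log file.
session_limits() {
    licensed=$(sed -n \
        's/.*The licensed concurrent session max is \([0-9][0-9]*\).*/\1/p' \
        "$1" | tail -n 1)
    configured=$(sed -n \
        's/.*The maximum number of concurrent sessions is \([0-9][0-9]*\).*/\1/p' \
        "$1" | tail -n 1)
    # A value of "unknown" means the corresponding line was not found.
    echo "licensed=${licensed:-unknown} configured=${configured:-unknown}"
}
```

For example, session_limits $GEMSTONE/data/gs64stone.log reports both values for a Stone that uses the default name.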

4.5  Identifying Logged-in Sessions

Privileges required: SessionAccess.

To identify the sessions currently logged in to GemStone, send the message System class>>currentSessionNames. This message returns an array of internal session numbers and the corresponding UserId. For example:

topaz 1> printit
System currentSessionNames 
%
session number: 2    UserId: GcUser
session number: 3    UserId: GcUser
session number: 4    UserId: SymbolUser
session number: 5    UserId: DataCurator

The session number can be used with other System class methods to stop a particular session or to obtain its UserProfile. See stopSession:aSessionId, terminateSession:aSessionId timeout:seconds and userProfileForSession:aSessionId.

NOTE
Be aware that it may take as long as a minute for a session to terminate after you send stopSession:. If the Gem is responsive, it usually terminates within milliseconds. However, if a Gem is not active (for example, sleeping or waiting on I/O), the Stone waits one minute for it to respond before forcibly logging it out. You can bypass this timeout by sending terminateSession:timeout:.

The method System class>>descriptionOfSession:aSessionId returns an array of descriptive information by which you can trace the session name to a particular person: the second element shows the operating system process id (pid), and the third element shows the name of the node on which it is running. In this example, the DataCurator session is running on “node1” as pid 3010:

topaz 1> printit
System descriptionOfSession: 5
%
an Array
	#1 an UserProfile
	#2 3010
	#3 node1
	...

For details about these methods and the information returned, see the class and method comments in the image.

4.6  Shutting Down the Object Server and NetLDI

Privileges required: SystemAccess and SystemControl.

To shut down GemStone from UNIX, first make sure that all user sessions have logged out. One way to find out about other user sessions is to send the message currentSessionNames to System. For example, using Topaz:

topaz 1> printit
System currentSessionNames 
%
session number: 2    UserId: GcUser
session number: 3    UserId: GcUser
session number: 4    UserId: SymbolUser
session number: 5    UserId: DataCurator

The SymbolUser and GcUser sessions are system sessions and will be shut down cleanly when the Stone is shut down. The above example includes session 5, which is the session of the user executing the example code.

After all user sessions have logged out, use the stopstone command, which performs an orderly shutdown in which all committed transactions are written to the extent files.

% stopstone [gemStoneName] [-i]

If you do not supply the name of the Stone repository monitor, stopstone prompts you for one. If the Stone was started without an explicit name, its name is the default, gs64stone. If necessary, use gslist to find the name.

The -i option aborts all current (uncommitted) transactions and terminates all active user sessions. If you do not specify this option and other sessions are logged in, GemStone will not shut down and you will receive a message to that effect.

stopstone prompts you to supply a GemStone username and password. The user must have the SystemControl privilege (initially, this privilege is granted to SystemUser and DataCurator). For details about user accounts and privileges, see Chapter 6.

There is a similar command to shut down the NetLDI network service.

% stopnetldi [netLdiName]

For more information, see the descriptions of stopstone and stopnetldi in the command reference in Appendix B.

If you are logged in to a GemStone session, you can invoke System class>>shutDown, which also requires the SystemControl privilege.

CAUTION
If you must halt a specific Gem session process or GemStone server process, be sure to use only kill or kill -TERM so that the process can perform an orderly shutdown.

Do NOT use kill -9 or another uncatchable signal, which may not result in a clean shutdown or may cause the Stone repository monitor to shut down when you intended to kill only a Gem process. If for some reason you need to send kill -9 to a shared page cache monitor, use ipcs and ipcrm to identify and free the shared memory and semaphore resources for that cache. If you send kill -9 to a Stone, use ipcs to determine whether ipcrm should be invoked.

4.7  Recovering from an Unexpected Shutdown

GemStone is designed to shut down in response to certain error conditions as a way of minimizing damage to the repository. If GemStone stops unexpectedly, it probably means that one of the situations described in the subsections that follow has occurred.

When GemStone shuts down unexpectedly, check the message at the end of the Stone log file to begin diagnosing the problem. Unless you have set up an environment variable, or specified another file on the startstone command line, the Stone log is $GEMSTONE/data/gemStoneName.log.

The $GEMSTONE/data directory also contains log files for the Stone child processes. The child processes have log names formed from gemStoneName, the process id, and a descriptive abbreviation. For instance:

gs64stone.log                   Stone repository monitor
gs64stone_14033admingcgem.log   Admin Gem
gs64stone_2963pcmon.log         Shared page cache monitor
gs64stone_2967pgsvrff.log       Free Frame page server
gs64stone_2984pgsvraio.log      AIO page server
gs64stone_2987pagemanager.log   Page Manager
gs64stone_2992reclaimgcgem.log  Reclaim Gem
gs64stone_2994symbolgem.log     SymbolGem
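The naming convention for these log files can be turned into a small lookup, which is useful when a data directory holds logs from many restarts. The following is a sketch based only on the names listed above. The function name describe_gs_log is hypothetical, and the patterns assume that a child process log always has an underscore and a numeric process id before the descriptive abbreviation.

```shell
# Hypothetical helper: map a log file name to the process that wrote it,
# using the naming convention described above.
# $1 is a log file name or path.
describe_gs_log() {
    case $(basename "$1") in
        *_[0-9]*admingcgem.log)   echo "Admin Gem" ;;
        *_[0-9]*reclaimgcgem.log) echo "Reclaim Gem" ;;
        *_[0-9]*symbolgem.log)    echo "SymbolGem" ;;
        *_[0-9]*pcmon.log)        echo "Shared page cache monitor" ;;
        *_[0-9]*pgsvrff.log)      echo "Free Frame page server" ;;
        *_[0-9]*pgsvraio.log)     echo "AIO page server" ;;
        *_[0-9]*pagemanager.log)  echo "Page Manager" ;;
        # Anything else ending in .log is assumed to be the Stone log.
        *.log)                    echo "Stone repository monitor" ;;
        *)                        echo "not a log file" ;;
    esac
}
```

For example, a loop such as for f in $GEMSTONE/data/*.log; do echo "$f: $(describe_gs_log "$f")"; done labels every log in the default data directory.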

Once the problem is identified, your recovery strategy should take into account the interdependence of GemStone system components. For instance, if an extent becomes unavailable, to restart the system and recover you may have to kill the Stone repository monitor if it is still running. The stopstone command won’t work in this situation, since the orderly shutdown process requires the Stone to clean up the repository before it stops.

Normal Shutdown Message

If you see a shutdown message in the system log file, GemStone has stopped in response to a stopstone command or a Smalltalk System shutdown method:

--- 03/21/14 13:00:43 PDT ---
LoginsSuspended is set to 1 by DataCurator from session 5
 
SHUTDOWN command was received from user DataCurator session 5 gem processId 29188.
Waiting for aiowrites to complete
Waiting for NetRead thread to stop
 
Now stopping GemStone.

After a normal shutdown, restart GemStone in the usual manner. For instructions, see Starting the GemStone Server of this chapter.

Disk Failure or File System Corruption

GemStone prints several different disk read error messages to the GemStone log file. For example:

Repository Read failure,
fileName = !#dbf!/users/gemstone/data/extent0.dbf
PageId = 94
File = /users/gs64stone/data/extent0.dbf
too few bytes returned from read()
DBF Operation Read; DBF record 94, UNIX codes: errno=34,...
	"A read error occurred when accessing the repository."

If you see a message similar to the above, or if your system administrator identifies a disk failure or a corrupted file system, try to copy your extents to another node or back them up immediately. The copies may be bad, but it is worth doing, just in case. If you’re lucky, you may be able to copy them back after the underlying problem is solved and start again with the current committed state of your repository.

Otherwise, you may need to restore the repository. For details, see the restore procedures in Chapter 9.

Shared Page Cache Error

If you find a message similar to the following in the GemStone log, the shared page cache (SPC) monitor process (shrpcmonitor) has died. The SPC monitor log, $GEMSTONE/data/gemStoneName_pcmonnnnn.log, may indicate the reason.

--- 04/04/13 15:07:19 PDT ---
    The stone’s connection to the local shared cache monitor was lost.
    Error Text: ’Network partner has disconnected.’

The unexpected shutdown of a Gem process may, in rare cases, result in a “stuck spin lock” error that brings down the shared page cache monitor and the Stone. GemStone uses spin locks to coordinate access to critical structures within the cache. In most cases, the monitor can recover if a Gem dies while holding a spin lock, but not all spin locks can be recovered safely. Stuck spin locks may result from a Gem crash, but a typical cause is the use of kill -9 to kill an unwanted Gem process. If you must halt a Gem process, be sure to use only kill or kill -TERM so that the Gem can perform an orderly shutdown.

Use startstone to restart GemStone. For instructions, see Starting the GemStone Server.

Fatal Error Detected by a Gem

If a Gem session process detects a fatal error that would cause it to halt and dump a core image, the Stone repository monitor may do the same when it is notified of the event. This response on the part of the Stone is configurable through the STN_HALT_ON_FATAL_ERR configuration option. When that option is set to True and a Gem encounters a fatal error, the Stone prints a message like this in its log file:

Fatal Internal Error condition in Gem
   when halt on fatal error was specified in the config file

By default, STN_HALT_ON_FATAL_ERR is set to False. That setting causes the Stone to attempt to keep running if a Gem encounters a fatal error; it is the recommended setting for GemStone in a production system. You can set STN_HALT_ON_FATAL_ERR to True during development and testing to provide additional checks for potential risks.

Some Other Shutdown Message

In the event of other shutdown messages in the GemStone log:

1. Consider whether the shutdown might have been caused by a disk failure or a corrupt file system, especially if you see an unexpected message such as Object not found. If you suspect one of these conditions, start with a page audit of the repository file (see Repository Page and Object Audit).

If the page audit fails, refer to Disk Failure or File System Corruption, and consult your operating system administrator.

If the audit succeeds, continue to the next step.

2. If you don’t suspect disk failure or a corrupt file system, try using startstone to restart GemStone. For instructions, see Starting the GemStone Server.

3. If the restart fails, you may have to restore the repository. For details, see the restore procedures in Chapter 9.

No Shutdown Message

If the GemStone log doesn’t contain a shutdown message, there has probably been a power failure or an operating system crash. In that event, the Stone repository monitor automatically recovers committed transactions the next time it starts. Use startstone to restart GemStone, as described under Starting the GemStone Server. See startstone for more information on this command.
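When triaging a shutdown, a first pass over the log can be automated by searching for the message fragments quoted in this section. The following sketch assumes those fragments appear verbatim in your log; classify_shutdown and its output labels are hypothetical, and you supply the path to your own Stone log.

```shell
# Sketch: map the shutdown messages quoted in this section to a short label.
# The function name and labels are hypothetical.
classify_shutdown() {
    log=$1
    if grep -q "Repository Read failure" "$log"; then
        echo "disk-failure"          # see Disk Failure or File System Corruption
    elif grep -q "connection to the local shared cache monitor was lost" "$log"; then
        echo "shared-page-cache"     # see Shared Page Cache Error
    elif grep -q "Fatal Internal Error condition in Gem" "$log"; then
        echo "gem-fatal-error"       # see Fatal Error Detected by a Gem
    else
        echo "unclassified"          # read the log by hand
    fi
}
```

A result of unclassified covers both a different shutdown message and no shutdown message at all; the sketch cannot tell these apart, so read the end of the log yourself before choosing a recovery path.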

4.8  Bulk-Loading Objects

During bulk loading of objects into the repository, it may be desirable to make the following changes:

  • Increase the GEM_TEMPOBJ_CACHE_SIZE configuration option so that large load transactions fit in temporary object memory. The size of each transaction (the number of 16 KB pages written) should be approximately 1/3 to 1/2 the size of GEM_TEMPOBJ_CACHE_SIZE, and no more than 1/4 to 1/2 the size of the shared page cache.
  • To keep transaction logs from consuming disk space during the load, you can direct them to /dev/null:
STN_TRAN_LOG_DIRECTORIES = /dev/null, /dev/null;
STN_TRAN_FULL_LOGGING = TRUE;

For more information, see STN_TRAN_LOG_DIRECTORIES and STN_TRAN_FULL_LOGGING.

NOTE
Be aware that using /dev/null for the tranlog directories will prevent you from being able to restore tranlogs in the event of a system failure.
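The transaction sizing guidance for GEM_TEMPOBJ_CACHE_SIZE reduces to simple arithmetic. The sketch below is one possible reading of it, not a GemStone tool: bulk_txn_pages is a hypothetical helper, it assumes both sizes are given in KB, and it takes the conservative end (1/4) of the shared page cache limit as the cap.

```shell
# Sketch: suggest a per-transaction size range, in 16 KB pages.
# Assumes both arguments are in KB; the function name is hypothetical.
bulk_txn_pages() {
    tempobj_kb=$1                      # GEM_TEMPOBJ_CACHE_SIZE
    spc_kb=$2                          # shared page cache size
    page_kb=16
    low_kb=$((tempobj_kb / 3))         # about 1/3 of the temporary object cache
    high_kb=$((tempobj_kb / 2))        # about 1/2 of the temporary object cache
    cap_kb=$((spc_kb / 4))             # conservative shared page cache limit
    if [ "$high_kb" -gt "$cap_kb" ]; then high_kb=$cap_kb; fi
    if [ "$low_kb" -gt "$cap_kb" ]; then low_kb=$cap_kb; fi
    echo "$((low_kb / page_kb)) $((high_kb / page_kb))"
}
```

For example, bulk_txn_pages 1000000 2000000 prints 20833 31250, which suggests committing after roughly 20,000 to 31,000 pages have been written. Treat the result as a starting point and adjust it according to the behavior you observe during the load.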

4.9  Managing Large Repositories

GemStone/S 64 Bit allows you to define a very large shared page cache, thereby enabling you to run very large repositories. This section presents special considerations that apply to large repositories.

Loading the object table at startup

When the repository starts, the object table is not yet loaded into memory, so initial accesses can take an excessively long time. If you encounter degraded application performance for a period after restarting the Stone, you may want to start up using cache warming. Cache warming causes an initial heavy I/O load at Stone startup, but subsequent application performance should be consistent with your normal application performance. You may choose to preload just the object table pages, which are most important for performance. You can also preload data pages, which improves performance if you have a large cache with a relatively fixed working set of data pages.

There are two ways to perform cache warming on startup.

Making efficient use of remote caches

When running a system on which many users log in simultaneously, consider using remote caches so that you don’t need to run all Gem processes on the same machine. There are a couple of ways to optimize this. The following configuration options are of particular interest:

  • To allow Gems to make more efficient use of the large cache, set the value of the GEM_PGSVR_FREE_FRAME_CACHE_SIZE configuration option to increase the size of the Gem free frame cache. For example:
GEM_PGSVR_FREE_FRAME_CACHE_SIZE = 25;
  • To improve performance on remote caches, set the value of the GEM_PGSVR_UPDATE_CACHE_ON_READ configuration option to True so that remote Gem sessions will update their local caches. For example:
GEM_PGSVR_UPDATE_CACHE_ON_READ = TRUE;

Disk Space and Commit Record Backlogs

Sessions only update their view of the repository when they commit or abort. The repository must keep a copy of each session’s view so long as the session is using it, even if other sessions frequently commit changes and create new views (commit records). Storing the original view and all the intermediate views uses up space in the repository, and can result in the repository running out of space. To avoid this problem, all sessions in a busy system should commit or abort regularly.

For a session that is not in a transaction, if the number of commit records exceeds the value of STN_CR_BACKLOG_THRESHOLD, the Stone repository monitor signals the session to abort by signaling TransactionBacklog (also called “sigAbort”). If the session does not abort, the Stone repository monitor reinitializes the session or terminates it, depending on the value of STN_GEM_LOSTOT_TIMEOUT.

Sessions that are in transaction are not subject to losing their view forcibly. A session in transaction can enable receipt of the signal TransactionBacklog and handle it appropriately, but doing so is optional. It is important that sessions do not stay in transaction for long periods in busy systems; this can result in the Stone running out of space and shutting down. However, sessions that run in automatic transaction mode are always in transaction; as soon as they commit or abort, they begin a new transaction. (For a discussion of automatic and manual transaction modes, see the “Transactions and Concurrency Control” chapter of the Programming Guide.)

To avoid running out of disk space, we recommend that you use manual transaction mode whenever possible. To enter manual transaction mode:

topaz> printit
System transactionMode: #manualBegin
%

At the point that this session needs to commit a change, begin a transaction manually, then make the changes:

topaz> printit
System beginTransaction .
AllUsers addNewUserWithId: #Jane password: 'gemstone' .
System commitTransaction
%

After you commit (or abort) the transaction, your session will return to waiting outside of a transaction.

Handling signals indicating a commit record backlog

Even in manual transaction mode, it is possible to cause a commit record backlog, depending on how your system is configured. Sessions should ensure that they commit or abort regularly, or set up sigAbort handlers to abort when requested by the Stone. A sigAbort handler may be as simple as this:

Example 4.1 sigAbort handler
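Exception
   installStaticException:
   [ :exception :GSdictionary :errID :array |
      "Abort to release the old view, then re-enable the signal."
      System abortTransaction.
      System enableSignaledAbortError ]
   "Assumed arguments; confirm them against the Programming Guide."
   category: GemStoneError
   number: (ErrorSymbols at: #rtErrSignalAbort)
   subtype: nil.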

Note that a session that is entirely idle does not become aware of the signal to abort, and may time out and be terminated by the Stone in spite of the handler. If your application may have idle sessions, we recommend setting up a timer that causes regular aborts when the session is otherwise idle.

Sessions that are in transaction, and therefore immune from the sigAbort mechanism, may also be signaled when there is a commit record backlog. When the number of commit records exceeds the value of STN_CR_BACKLOG_THRESHOLD, and the session holding the oldest commit record is in transaction, the Stone repository monitor signals the session by sending TransactionBacklog. The session then has the opportunity to perform a continueTransaction to update its view of unmodified objects. It may also commit or abort. Unlike sigAbort, this signal can be ignored; a session that ignores it receives no further signals from the Stone.

Example 4.2 finishTransaction handler
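Exception
   installStaticException:
   [ :exception :GSdictionary :errID :array |
      "Refresh the view of unmodified objects, then re-enable the signal."
      System continueTransaction.
      System enableSignaledFinishTransactionError ]
   "Assumed arguments; confirm them against the Programming Guide."
   category: GemStoneError
   number: (ErrorSymbols at: #rtErrSignalFinishTransaction)
   subtype: nil.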

For more information on these signals, see the Programming Guide for GemStone/S 64 Bit.
