2. Bugs Fixed

The following bugs have been fixed in this release.

AIO page server fatal errors during startup may leave Stone hanging

During startup, the sockets that connect the Stone and the AIO page servers did not detect disconnects. If an AIO page server encountered an error such as a write failure, it exited (correctly), but the Stone continued to wait for a response. (#44587)

Multithreaded scans now usable from remote Gems

In previous 3.x versions, the multithreaded scans used by operations such as markForCollection could only be run multithreaded from Gems on the Stone's host. When executed from a remote Gem, these operations ran single-threaded. This restriction has been removed, and remote Gems can now run multithreaded scans. (#41178, #41177)

On first use of a remote host, linked logins may fail

Remote connections require that each host have a /opt/gemstone/locks/gemstone.hostid file to identify itself. On a remote host, if this file is not found, it is created when a netldi starts up. However, if a netldi had never been started on that host, and the first GemStone process was a linked session (which no longer requires a netldi), the login failed with a file not found error. (#45111)

Garbage Collection issues

Symbol Garbage Collection not usable

There were bugs in the primitive code supporting garbage collection of unused Symbols, as enabled by STN_SYMBOL_GC_ENABLED; it did not complete correctly with a non-trivial number of symbols or with corner cases of symbol references. Symbol GC was disabled in v3.2.7 and later. (#45403)

markGcCandidates issues

The internal phases in markGcCandidates were numbered incorrectly with respect to the object table sweep phase number. As a result, some phases swept a set of pages that incorrectly included scavengeable pages. (#45227)

Since markGcCandidates is no longer of significant benefit, this function has been removed in v3.3.

Exception Handling issues

Exception handler #resume from primitive operation returned nil

If an exception handler was triggered from within low-level C code in a primitive, nil was incorrectly returned as the result of the Smalltalk method that called the primitive. (#44375)

Incorrect return from exception handler with non-local return in ensure: block

A non-local return within an ensure: block did not correctly override the on:do: handling. (#45056)

For example, the following previously returned #ok, while the correct return should be 7.

[ [3 zork] ensure: [^7].
] on: Error do: [:ex | ex return: #ok]

Blocks may return wrong values when nested variable is assigned in outer scope

A block that is nested within another block could, in some cases, have the wrong value for a variable that is declared in an outer scope. An assignment to the variable performed in the outer scope was not always propagated to the inner block when it should have been. (#44478)

For example, the following previously returned 1, while the correct result should be 100.

[ | temp myBlock |
temp := 1.
myBlock := [ temp ].
temp := 100.
myBlock value ] value

Unable to trap soft break in linked topaz

Soft breaks can normally be caught by a handler. While this worked in RPC topaz, in linked topaz the soft break was treated like a hard break. (#43502)

#halt returned the result of resume, not receiver

Object halt returned the result of the signal, rather than the receiver. This created problems continuing after a halt. (#45126)

A do: loop on an Interval does not behave correctly with step through

When stepping through a do: loop on an instance of Interval (using the GBS or topaz step thru commands), the step through behaved as a continue on reaching the end of the block. (#45320)

Some ANSI exception classes inherited error number

The classes CompileWarning, Deprecated, InterSessionSignal, TestFailure, and ResumableTestFailure inherited the error number from their superclass. (#43411)

Improvements in Error reporting

CompileError display

Compilation errors that occurred in some block contexts (for example, in code sent to external sessions) did not display useful information, only a message such as 'a CompileError occurred (error 1001), compilation errors -- parameter 1 is error descriptor'. Now, the error text is provided. (#45626)

Improved error reporting for String and MultiByteString>>encodeUsing:

When the primitive failed, no useful information was provided. (#44140)

Improved error for attempt to store into internal class

When working with an instance of an internal class (such as LargeObjectNode), the error returned was "Unable to store forward reference to object that is not allocated to this session", which was not clear; it now says "Attempt to store a reference to a hidden object". (#44232)

Error message referencing primitive -1/-2 does not exist is confusing

An error message saying "Primitive number -1 does not exist in the virtual machine" was incorrectly worded. These values are error indicators, with -1 meaning that the primitive does not exist, and -2 meaning that it should not be used. The most likely cause is incompatible Stone image and executable versions. (#45692)

midLevelCacheConnect: error could return empty string rather than error

For certain kinds of errors from an attempt to connect to a mid-level cache using midLevelCacheConnect:*, the connection could fail, but the return value was an empty string rather than an error message. (#45142)

OutOfRange errors in setting configuration values did not include details

The OutOfRange error returned when setting a runtime configuration parameter did not include details. (#44107)

Improved error when using a name that matches a system service

Service names that require ports, such as gs64stone and gs64ldi, look for assigned port numbers in the OS services database. If a name is used that is already defined by the system, such as "auth", it will find this port, which may not be usable by the GemStone process. Now, the log files will indicate how the port was determined to distinguish this case. (#45268)

Kernel methods not respecting transient changes to symbol list

Some kernel methods referenced System myUserProfile symbolList, which did not respect any transient changes to the symbol list. This has been changed to use GsSession currentSession symbolList. (#45236)
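As a minimal sketch of the distinction, the two expressions named above can be evaluated side by side. The temporary variable names are illustrative only; the two expressions themselves are the ones quoted in this note.

```smalltalk
| persistentList sessionList |
"Symbol list held by the UserProfile; transient changes are not reflected here."
persistentList := System myUserProfile symbolList.
"Symbol list of the current session; this is what the kernel methods now use."
sessionList := GsSession currentSession symbolList.
```

Application code that needs to honor transient changes to the symbol list would likewise use the second form.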

Binary selectors /- and /-/ failed

These legal binary selectors were not compilable. (#45062)

Index and Query issues

GsQuery readStream with equality does not correctly handle cases where comparable objects are not equal

Since Strings and Symbols, for example, share a superclass and can be ordered using <=, etc., instances of each may be mixed together as objects in an index. However, a String and a Symbol with the same contents are not equal; in other words, a <= b and a >= b does not mean a = b. A query using = must therefore return only instances of the appropriate class. GsQuery readStream incorrectly returned objects of any class for which a >= b and a <= b, rather than requiring a = b. (#44079)
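The distinction between ordering and equality can be seen in a short sketch such as the following. The variable names are illustrative; per the description above, the first two comparisons are expected to answer true and the third false.

```smalltalk
| aString aSymbol |
aString := 'abc'.    "a String"
aSymbol := #abc.     "a Symbol with the same characters"
"Answer the three comparison results together in an Array."
Array
    with: aString <= aSymbol
    with: aString >= aSymbol
    with: aString = aSymbol
```

An equality query over an index that contains both kinds of object should follow the third comparison, not the first two.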

removeAllIndexes could run out of memory

removeAllIndexes bypassed the collections and updated the dependency maps for each index element directly. On a large collection, this could result in AlmostOutOfMemory errors. Now, the DependencyMaps are put into a hidden set for processing, where they do not affect temporary object memory.

Index audit does not catch entries with equal keys, but OOPS out of order

An equality index creates a btree that may include many keys with equal values. There were conditions in which the btree entries with equal keys were not ordered deterministically; this was not caught by the indexAudit code. (#44669)

IndexManager usageReport may get MNU #nextWord with Portable Streams installed

The code invoked by the IndexManager >> usageReport method calls PositionableStream >> nextWord, which was missing from portable streams (PositionableStreamPortable and subclasses). (#45807)

Linux network statistics incorrectly relative to start of statmonitor

The Linux network statistics InputPackets, InputKBytes, MulticastInputPackets, OutputPackets, and OutputKBytes were relative to when statmonitor started recording, rather than absolute values. (#45052)

GEM_PGSVR_COMPRESS_PAGE_TRANSFERS only compressed one direction

The Gem configuration option GEM_PGSVR_COMPRESS_PAGE_TRANSFERS, when true, compresses page transfers from the Page Server to the Gem. However, it did not compress transfers in the other direction, from the Gem to the Page Server. (#44783)

GciEnableFreeOopEncoding and GciGetFreeOopsEncoded broken

The encoding functions in the GCI were not working correctly. GciGetFreeOops failed if called after GciEnableFreeOopEncoding, and GciGetFreeOopsEncoded always failed. (#44738)

NetLDI socket leak on incompatible connection

When the NetLDI opened a connection to an incompatible version of GemStone, there was a code path in which the socket connection was not closed, resulting in a socket leak in the NetLDI. (#45902)

StringKeyValueDictionary removeKey: may fail incorrectly

Dictionaries locate keys using hash, and may look up Strings and Symbols interchangeably, or (in Unicode comparison mode) mix Unicode and traditional Strings. However, removeKey: used = to compare keys. It was therefore possible for at: to return a value, but for removeKey: to fail. (#44097)
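A sketch of the kind of sequence affected is shown below. The key and value are made up for illustration, and whether the Symbol lookup finds the String key depends on the comparison rules described above.

```smalltalk
| dict |
dict := StringKeyValueDictionary new.
dict at: 'color' put: 'red'.
"Look up using a Symbol that has the same characters as the String key."
dict at: #color ifAbsent: [ nil ].
"Remove using the same Symbol; this is the case that could fail before the fix."
dict removeKey: #color.
```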

StringKeyValueDictionary>>at:ifAbsent: did not use block on nil argument

The methods StringKeyValueDictionary >> at:ifAbsent: and StringKeyValueDictionary >> at:otherwise: returned the error #rtErrNilKey (error 2090) when the specified key was nil, rather than executing the block supplied to handle nil keys. (#45367)
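For illustration, a lookup of the following shape previously signaled the nil key error; with the fix, the block is expected to be evaluated instead. The dictionary contents and the #noSuchKey result are arbitrary choices for this sketch.

```smalltalk
| dict |
dict := StringKeyValueDictionary new.
dict at: 'color' put: 'red'.
"A nil key is treated as absent, so the block supplies the result."
dict at: nil ifAbsent: [ #noSuchKey ]
```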

Risk of SEGV when VW on Linux invokes C code

When C functions are invoked from the VisualWorks executable, the stack pointer is not correctly aligned to a 16-byte boundary. This is related to changes in the stack alignment rules in recent versions of gcc, compared with the compiler versions used to build the VisualWorks executables. Problems can arise when using GBS on Linux with application user actions (which are deprecated in GBS), depending on the compiler version and optimization used for the user actions. (#42632)

Topaz issues

Linked topaz retained some changes to internal globals over logins

Customized values for IcuCollator default, IcuLocale default, and Locale were retained in a linked topaz process, and persisted through logout and login. (#45532)

Topaz nbstep command broken

The topaz command nbstep returned a message not understood error on #_stepOverInFrame:mode:replace:tos: (#44590)

Topaz shell does not work well with lineeditor

Using the topaz shell command may not work with the lineeditor enabled. The lineeditor is enabled by default. (#45249)

Topaz filename expansion did not recognize ~ in path

A tilde, representing the home directory, is now recognized by topaz commands that use a filename, such as output push and fileout tofile:. (#44018)

Hot and Warm Standby issues

Standby system may lag behind primary significantly after large reclaim

When a large reclaim operation occurs on the primary system, the same reclaim operation will need to be completed on the standby system. This could take significant time, during which period the standby lagged behind the primary. This was difficult to address on the standby, since the configuration of reclaim could only be identical to that on the primary (changes required commit, which is disallowed during restore). (#44265)

In v3.3, the code that coordinates between the transaction log restore and the ReclaimGem has been made more efficient. You may now configure reclaim on the standby differently than on the primary, using the run-time configuration options described under Runtime configuration of ReclaimGem and AdminGem.

Hot standby connection may time out with large number of tranlogs

When there were a large number of tranlogs in the master system's tranlog directory, the logsender-logreceiver connection could time out. (#45473)

Possible for logreceiver to get tranlogs from multiple repositories

In hot standby systems that are incorrectly configured, it was possible for the logreceiver to reconnect to a logsender that is associated with a different repository than the one to which it was previously connected. Further information has been added to tranlogs so that such inconsistencies are detected. (#43124)

fork: did not handle single quotes correctly

GsHostProcess class >> fork: accepts a String argument containing a command line. This string correctly handled double quotes, but not single quotes. (#45386)
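The two quoting styles can be sketched as follows. The sh and echo commands are assumptions for a typical UNIX host and are not part of GemStone; note that a single quote inside a Smalltalk String literal is written as two single quotes.

```smalltalk
"Double quotes in the command line need no escaping in a Smalltalk String."
GsHostProcess fork: 'sh -c "echo hello"'.
"Doubled single quotes in the literal become single quotes in the command line,
 so the command passed is: sh -c 'echo hello'"
GsHostProcess fork: 'sh -c ''echo hello'''.
```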

Checkpoints not written during replay of possible dead objects

If the Stone was stopped and restarted during the restore of transaction logs that include a reclaim of possible dead objects, the reclaim had to restart from the beginning. Now, checkpoints are written during the replay. The main impact of this would be on warm and hot standby systems. (#45901)

GsFile does not handle file names with extended characters

When passing a string that includes characters with codepoints outside the ASCII range to a GsFile file open method, the resulting filename may be truncated or contain invalid characters. (#45326, #44786)

startcachewarmer handling of NRS in stone name

In some earlier releases, using NRS syntax in the stone name argument to startcachewarmer resulted in a crash; in later versions, startcachewarmer did not crash or report an error, but it did not perform the warming operation. (#45604)

Upgrade and Conversion issues

Conversion problems with very large Symbols

GemStone/S 64 Bit does not allow Symbols larger than 1024 bytes. However, repositories converted from 32-bit may contain Symbols that are larger, including ones larger than 8K that require LargeObjectNodes for internal storage.

During the conversion to 3.x, startstone -C re-sorts AllSymbols; this sort operation failed if any symbols over 8K were present, causing startstone -C to fail. (#45923)

As of v3.3, the startstone -C conversion will not fail; any symbols over 8K are removed from AllSymbols. These symbols remain as uncanonical symbols and can be removed or, if unreferenced, will be removed by garbage collection. It is recommended that before conversion, you examine your repository for large symbols and convert appropriately.

Filein class definitions that fail in _equivalentSubclass:... did not provide details

When a class definition is filed in, it invokes the method

Class>>_equivalentSubclass:superCls:name:newOpts:newFormat:
newInstVars:newClassInstVars:newPools:newClassVars:inDict:
constraints:isKernel: 

to determine if a new version of the class should be created, or if the definition is the same in all ways. When this method determined that the classes were not the same, it did not provide details about the specific cause of the failure, which made analysis of upgrade/conversion issues difficult. (#45836)

ClassOrganizer instances persisted through upgrade had missing symbol list

Persisted instances of ClassOrganizer from a pre-3.2 repository, when upgraded to 3.2 or later, were not fully initialized and encountered errors on use. (#44199)

Upgrade issues with #GemStoneRCLock

There were code paths in upgradeImage, in which the handling of #GemStoneRCLock was incorrect and could result in upgrade failure. (#45857)

filein of older code could introduce comment class method

In versions earlier than 3.1, class comments could be implemented by defining individual #comment class methods. In v3.1 and later, the comment for a class is in a separate field in the Class, and user classes should not reimplement the class methods comment or comment:. Now, filing in older code that includes comment or comment: methods will get a compiler warning. (#44566)

Improved handling of login when repository upgrade required

When attempting to log in to a repository that has been started on extents from an older version, for which upgradeImage is required and has not been run, one of a number of unspecific errors would previously occur. This is now handled explicitly with the new error 2486, RT_ERR_UPGRADE_WARNING, and a message that repository upgrade is needed. (#44183)

SymbolGem that is slow to exit is made invalid, not handled cleanly

When the SymbolGem was requested to stop and did not shut down within the timeout, the Stone set its cache state to invalid. The terminating SymbolGem reported this as a lostOT, and did not correctly handle the exit. (#45265)

Compiling method with Array SymbolDictionary required #OtherPassword privilege

When compiling a method (using compileMethod:dictionaries:category:), with the target dictionaries: argument being an instance of Array, the #OtherPassword privilege was required. This is not correct; this should require #CodeModification, but not #OtherPassword. (#44433)

Inconsistent lock status after failed commit on temporary locked object

If a commit fails when there is a lock on a temporary object, after an abort the lock status becomes inconsistent. The lock cannot be removed, and methods to access lock status are inconsistent. (#45744)

SlotsTotalCount off-by-one

The value recorded by the ShrPcMonitor cache statistics SlotsTotalCount was always one higher than the actual value. (#45713)

gslist -x did not show startnetldi -D value

The -D option was added in recent versions of GemStone. The value of this argument was not reported in the results of gslist -x. (#45935)
