OpenEdge 4GL Unknown Value and the LITERAL-QUESTION Attribute

December 21, 2007 · Filed Under Development

Sometimes a cigar is just a cigar, and sometimes a question mark is just a question mark.

You frequently need to assign character values to a buffer-field. Sometimes that value may be a question mark.

        bh:BUFFER-FIELD('c1'):BUFFER-VALUE = '?'.

When you query the buffer-value, Progress reports that the value is unknown.

        MESSAGE
            bh:BUFFER-FIELD('c1'):BUFFER-VALUE = '?'
            bh:BUFFER-FIELD('c1'):BUFFER-VALUE = ?
            VIEW-AS ALERT-BOX.

If this behavior is not what you intended, or you need to keep the Unknown Value distinct from a “?” character, then you have a bug in your program.

How do you get Progress to treat the “?” appropriately? The trick is the LITERAL-QUESTION attribute.

When LITERAL-QUESTION = FALSE, “?” and ? will both evaluate to the unknown value.
When LITERAL-QUESTION = TRUE, “?” is a question mark, and ? is the unknown value.

        bh:BUFFER-FIELD('c1'):LITERAL-QUESTION = TRUE.
        bh:BUFFER-FIELD('c1'):BUFFER-VALUE = '?'.
        MESSAGE bh:BUFFER-FIELD('c1'):BUFFER-VALUE = '?' VIEW-AS ALERT-BOX.

Sadly, there is no single-step way to make it always be true for all fields in a temp-table, but something like this may work for you:

        FUNCTION SetLiteralQuestion RETURNS LOGICAL
          ( INPUT bh AS HANDLE ):
          DEFINE VARIABLE i AS INTEGER NO-UNDO.
          IF NOT VALID-HANDLE(bh) THEN RETURN FALSE.
          /* Accept a temp-table handle by switching to its default buffer. */
          IF bh:TYPE EQ 'TEMP-TABLE':U THEN bh = bh:DEFAULT-BUFFER-HANDLE.
          IF bh:TYPE NE 'BUFFER':U THEN RETURN FALSE.
          /* Turn on LITERAL-QUESTION for every field in the buffer. */
          DO i = bh:NUM-FIELDS TO 1 BY -1:
            bh:BUFFER-FIELD(i):LITERAL-QUESTION = TRUE.
          END.
          RETURN TRUE.
        END FUNCTION.  /** SetLiteralQuestion() **/

You can then invoke the SetLiteralQuestion function using either a table-handle or a buffer-handle.

        SetLiteralQuestion(TEMP-TABLE ttFoo:HANDLE).
        SetLiteralQuestion(bh).
        SetLiteralQuestion(INPUT BUFFER customer:HANDLE).

Since the function loops over every field in the buffer, call SetLiteralQuestion() once when the buffer is set up rather than from inside a loop.

OpenEdge ABL Phantom Error

November 30, 2007 · Filed Under Development, Reliability

At Solvepoint we’ve coined a new term, the Phantom Error. What is a Phantom Error you ask? Well, it is an undesirable circumstance where the Progress VM decides to raise the error condition but leaves error-status:get-message() set to the empty string.

This, of course, can lead to a host of ugly bugs, not the least of which is having no idea where the error happened or why.

Here is a simple example that demonstrates a Phantom Error:

DEFINE NEW GLOBAL SHARED VARIABLE myHandle AS HANDLE NO-UNDO.
main: DO ON ERROR UNDO main, RETRY main:
  IF RETRY THEN do:
    MESSAGE RETURN-VALUE error-status:get-message(1) VIEW-AS ALERT-BOX.
    LEAVE main.
  END.
  RUN someProc IN myHandle.
END.

You’ll notice that both return-value and get-message(1) return blank.

If you happen to be in a terminal based procedure editor, however, you will receive the error message in the “message area” at the bottom. Unfortunately, this doesn’t do much good for server code.

As a consequence, server code will typically swallow these Phantom Errors at worst, or report the error out of context at best.

So, what can be done? Diligent error trapping is called for. Protect the code by always testing handles before using them.
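A minimal sketch of that kind of handle test, applied to the example above. The INTERNAL-ENTRIES check and the wording of the returned messages are illustrative assumptions, not part of the original example:

```abl
DEFINE NEW GLOBAL SHARED VARIABLE myHandle AS HANDLE NO-UNDO.

/* Test the handle before using it, so a bad handle produces a
   message we control instead of a Phantom Error. */
IF NOT VALID-HANDLE(myHandle) THEN
    RETURN ERROR "myHandle is not valid; someProc was not run.".

/* Optional extra check (assumption: myHandle should be a procedure
   handle that contains an internal procedure named someProc). */
IF myHandle:TYPE NE "PROCEDURE":U
   OR LOOKUP("someProc":U, myHandle:INTERNAL-ENTRIES) = 0 THEN
    RETURN ERROR "someProc was not found in myHandle.".

RUN someProc IN myHandle.
```

Because the failure is reported through RETURN ERROR, the caller sees a non-blank RETURN-VALUE that says what went wrong and where.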

Unfortunately, this isn’t the only type of code that will produce a Phantom Error. We will discuss other Phantom Errors under the Tag “Phantom Errors” in other posts.

Little-known OpenEdge Database _User Table Security Behavior

November 30, 2007 · Filed Under Development, Security

Since it is not well known, I thought that I’d bring the following information to light regarding the OpenEdge _User Table Security Behavior.

  1. OpenEdge Database security is implemented at the table level not the connection level. This means that, even with “blank user disabled”, a client may still connect. Security is only checked when that user attempts to access a table. Take careful note of the confirmation message:

    You are about to prevent the blank userid from
    accessing your database. Users who are not
    listed in the data security fields will not
    be able to compile procedures with this database.

    Notice it says nothing about connecting. Which brings us to point 2.

  2. Table level access is compiled into a .r. Therefore a _user without any access to that table may still run the .r successfully (even if the client connected without a user ID).
  3. The _user row may be modified or completely removed without affecting the running sessions (even after a STOP is raised).
  4. The _user password can only be changed when connected via that _user.
  5. The _user row, however, may be deleted by any session that has _can-delete on the _user table.
  6. The same _user row may then be created again with a different password (yikes!). (Still does not affect any running sessions).

Please keep these behaviors in mind as you plan your security strategy.

PS: Please be aware that Progress says it will be modifying security including adding run-time capabilities sometime in the near future.

OpenEdge ABL Memptr Pitfalls

November 30, 2007 · Filed Under Development, Reliability

Memptr is a very powerful datatype in the ABL/4GL. It allows the programmer to store any type of data including binary. However, as with all dynamic objects in Progress, one must be careful when using it.

Pitfall number one: Scope. Memptrs do not follow the rules of scope to which 4GL programmers have become accustomed.

What does this mean? Why do I care? Well, I’ll tell you. Since Memptrs do not follow the rules of scope, when a variable holding a Memptr HANDLE goes out of scope the associated memory is NOT released. You care because if this happens you now have a memory leak. Every time your program is executed it will leak memory equal to the amount allocated to your memptr.

Pitfall number two is the allocation process. Probably 99% of code you find will do this:

/* define a memptr variable */
def var m as memptr no-undo.
/* Allocate the memory */
set-size(m) = 1024.
/* now go ahead and start using it… */

See anything wrong with the above? If not, don’t blame yourself. You are used to Progress doing this for you, but in this case, it does not. What am I referring to? You C programmers will know! The memory that has been allocated to m has not yet been “initialized”. This means the memory will contain random data: whatever happened to be in there before the allocation. If you are using put-string before your first get-string (without the numbytes parameter) then you have nothing to worry about since put-string automatically puts a NULL (0) as the next byte after the string and get-string will only read up to that NULL. But for other operations like put/get-byte or put/get-bytes or put/get-string with the numbytes parameter, grabbing random uninitialized data out of memory could bite you, so beware.
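If the code relies on the buffer starting out empty, one option is to clear it right after allocating it. This is a minimal sketch assuming a byte-by-byte loop is acceptable for the buffer size involved:

```abl
DEFINE VARIABLE m AS MEMPTR  NO-UNDO.
DEFINE VARIABLE i AS INTEGER NO-UNDO.

/* Allocate the memory */
SET-SIZE(m) = 1024.

/* Overwrite every byte with zero before the first read.
   Byte positions in a memptr start at 1. */
DO i = 1 TO GET-SIZE(m):
    PUT-BYTE(m, i) = 0.
END.

/* ... use m here ... */

/* Release the memory when done */
SET-SIZE(m) = 0.
```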

The final pitfall is also related to allocation. In your code, you may define a memptr at the beginning of a procedure and then use it in several places throughout the code. In each use you will want to allocate the appropriate amount of memory. So you may code something like this:

def var m as memptr no-undo.
set-size(m) = 128.
/* do stuff with it here… */
set-size(m) = 1024.
/* do other stuff with it here … */
/* and so on… */
/* now we clean up and return */
set-size(m) = 0.
return.

This looks great right? We are allocating and cleaning up just as we should, right? Well, yes and no. The pitfall is that the second set-size where we allocate 1024 bytes doesn’t actually allocate anything. It essentially does nothing at all. AND it does not raise an error condition. So now we have a potential bug if the code attempts to put more than 128 bytes into that memptr.

This is solved by setting the size of the memptr to 0 first, and only then setting the new size.
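A short sketch of the release-then-allocate sequence, using the same sizes as the example above. The GET-SIZE message is only there so you can confirm the new size for yourself:

```abl
DEFINE VARIABLE m AS MEMPTR NO-UNDO.

SET-SIZE(m) = 128.
/* ... first use ... */

/* Release the old allocation, then allocate the new size. */
SET-SIZE(m) = 0.
SET-SIZE(m) = 1024.

/* Check how much memory is actually allocated now. */
MESSAGE GET-SIZE(m) VIEW-AS ALERT-BOX.

/* ... second use ... */

/* Clean up before returning. */
SET-SIZE(m) = 0.
```

Note that the contents of the first allocation are gone after the release, so copy out anything you still need before resizing.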

Moral of the story: memptrs need special care, as they do not participate in conventional scoping and cannot be resized until they are cleared.

Hope this helps to save you some time in your coding efforts!

OpenEdge Memory Management Anti-pattern

November 30, 2007 · Filed Under Development, Performance Tuning, Reliability

Hello all,
I’ve recently been reviewing some Progress 4GL and have found an all too common anti-pattern related to memory management.

When a variable is defined, the Progress runtime client (Virtual Machine) allocates memory at runtime for that variable. Once the variable is out of scope, the memory is released and everyone is happy. Progress programmers have grown comfortable with this design and obliviously define variables whenever they are needed knowing that they will be de-allocated automagically by the Progress VM.

Then came dynamic objects.

Progress programmers were overjoyed! They could now create temp-tables, buttons, queries all on the fly at runtime. No more convoluted if-then statements or .i’s or having to code a different “for each” for every combination of where clause.

However, as with any power-bestowing feature, there is a dark side to this wonderful new world of dynamic 4GL: memory management. Most programmers never really stopped to consider the fact that if something is created dynamically at runtime, the VM has no way of knowing the scope. It cannot tell when to release the memory required for the dynamic object. REMEMBER: the scope of the variable you happen to assign the object to HAS NO BEARING on the scope of the OBJECT since it can be passed around. In other words, the scope of the variable holding the HANDLE to the object is NOT bound to the OBJECT itself. The scope of ALL OBJECTS is always the SESSION. This applies to GUI widgets, dynamic queries, temp-tables, etc.

Java (and other VMs) solve this through the use of a separate execution thread running concurrently called a Garbage Collector. Its job is to scan memory and find dynamic objects that are no longer “reachable” and release their memory. Unfortunately, the Progress VM has no such thread/concept.

To add insult to injury, not only does this leak memory but it also causes progressively worse performance: The more widgets in memory, the more time it takes to create another widget. Here are three examples (run on a 2.1ghz processor):

Button Handles (create 1000 button handles): minimum memory required/lost per Button Handle: 512 bytes

Query Handles (create 1000 query handles): minimum memory required/lost per Query Handle: 1024 bytes

Temp-Table Handles (create 1000 temp-table handles): minimum memory required/lost per Temp-Table Handle: 512 bytes

So, as you can see, from both a memory and a CPU footprint standpoint, it is very important to be sure to clean up your objects.

This may seem like a large number of handles, but remember two important points:
1. This is at the session level. This means that if a.p calls b.p which creates objects then those will exist for the life of the session: THERE IS NO SCOPE other than SESSION FOR DYNAMIC OBJECTS and they are NEVER automatically reclaimed!
2. If the programs are running as part of a long-running session such as AppServer, EagleIQ server or Webspeed, then you have to consider the cumulative effect over days, weeks or months.
Also note that if it is a temp-table, it could potentially have a much larger memory footprint.

So, what must be done?
It is up to the Progress programmer to clean up each and every dynamic object created using the “delete object” command.
It may be appropriate to create a widget-pool in which to assign your objects so you can just delete the pool and all the objects within will be released as well. In fact, if you create a non-persistent widget-pool, it will be automatically deleted when it goes out of scope. Creating the object into a non-persistent pool will make it behave as if it were scoped at the level that the widget-pool is created: in effect, making it behave as if it were statically defined.

If you don’t use a non-persistent widget-pool, then it is also important to be sure the “clean up” code is executed even when there is an error. For example, the following will bleed memory if an error condition is raised within the blah blah:

      procedure doQuery:
          def var qh as handle.
          create query qh.
          /* do some business logic here */
          do while true:
              /* blah blah: work that may raise an error */
          end.
          delete object qh.
      end.

However, if you create a non-persistent widget pool, then it is automatically deleted when it goes out of scope. So the following will not leak memory even if an error condition happens:

procedure doQuery:
    def var qh as handle.
    create widget-pool "wp".
    create query qh in widget-pool "wp".
    /* do some business logic here */
    do while true:
        /* blah blah: work that may raise an error */
    end.
    delete object qh.
end.
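If a widget-pool is not an option, the explicit cleanup can be protected instead. This is only a sketch, assuming the risky work can be wrapped in its own block; the block label `work` is a made-up name, and the sketch covers the ERROR condition only, not STOP or QUIT:

```abl
PROCEDURE doQuery:
    /* NO-UNDO keeps the handle value intact if the block below is undone. */
    DEFINE VARIABLE qh AS HANDLE NO-UNDO.

    CREATE QUERY qh.

    /* An error inside this block undoes and leaves the block,
       so execution continues with the cleanup that follows. */
    work: DO ON ERROR UNDO work, LEAVE work:
        /* business logic that may raise an error goes here */
    END.

    /* Reached on both the normal path and the error path. */
    IF VALID-HANDLE(qh) THEN
        DELETE OBJECT qh.
END PROCEDURE.
```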

The widget-pool may be defined at the .p level as well. In this case the pool is deleted when the .p is exited.
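A sketch of the .p-level variant, assuming an unnamed pool is acceptable. Dynamic objects created without an IN WIDGET-POOL phrase go into the most recently created unnamed pool. The file name in the comment is hypothetical:

```abl
/* myReport.p (hypothetical file name) */

/* Unnamed, non-persistent pool: deleted when this .p ends. */
CREATE WIDGET-POOL.

DEFINE VARIABLE qh AS HANDLE NO-UNDO.
DEFINE VARIABLE th AS HANDLE NO-UNDO.

CREATE QUERY qh.        /* lands in the unnamed pool created above */
CREATE TEMP-TABLE th.   /* so does this dynamic temp-table */

/* ... use qh and th ... */

/* No DELETE OBJECT needed here: both objects go away with the pool. */
```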

Oh, by the way, persistent procedures and memptrs are two other constructs that have a session level scope. However, they cannot be part of a widget-pool and therefore must be handled individually.

10.1B Changes Integer Math Wrapping Behavior

November 22, 2007 · Filed Under Development, Reliability

An encryption component built into one of Solvepoint’s products came to our attention in a regression test on 10.1B. The root cause was a change in integer math wrapping behavior. Notice, I said “change”. I did not say “improvement”.

Some background, first. In most major computing languages integers wrap when an operation overflows the maximum allowed integer.

  • In C# int arithmetic wraps by default when it is not a compile-time constant (for example, i + 1 when i holds int.MaxValue).
  • In ANSI C unsigned integers wrap.
  • In Java integers wrap.

Integer wrapping behavior is an embedded, predictable and necessary part of many applications including a number of encryption algorithms, communications stacks, and data verification algorithms.

What about Progress applications? Well,…

All versions of Progress prior to 10.1B wrapped integers.

With the coming of 64-bit integers someone at Progress realized that it was possible to do 32-bit arithmetic in 64-bits and actually know whether the result went beyond the maximum (or minimum) integer. But just because something could be done doesn’t mean it should be done. So even though the Decimal data-type exists as an easy alternative to those wanting to avoid classic integer behavior, Progress changed integer math in 10.1B to throw an error instead of wrapping. If you’re incredulous, see Solution ID P119716.

In 10.1B 32-bit integer wraps throw an error “Value integer too large to fit in INTEGER datatype. (13682)”. Be prepared for some code written long ago to break.

If we multiply an integer less than the maximum 32-bit integer by a multiplier that results in a value greater than the maximum 32-bit integer, we can see the issue played out. In the following example we will use 1234567891 (a large, easy to remember number less than the max 32-bit int, +2,147,483,647) and multiply it by 10. Before 10.1B the result is -539222978, but as of 10.1B it returns an error.
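For code that depends on the old behavior, the wrap can be reproduced by hand. The sketch below assumes 10.1B or later (it needs INT64 and 64-bit constants) and handles non-negative inputs only. Wrap32 is a hypothetical helper, not a Progress built-in:

```abl
/* Emulate pre-10.1B 32-bit wrapping for a non-negative 64-bit value. */
FUNCTION Wrap32 RETURNS INTEGER (INPUT pValue AS INT64):
    DEFINE VARIABLE low32 AS INT64 NO-UNDO.

    /* Keep only the low 32 bits (2^32 = 4294967296). */
    low32 = pValue MODULO 4294967296.

    /* Values at or above 2^31 are negative in two's complement. */
    IF low32 >= 2147483648 THEN
        low32 = low32 - 4294967296.

    RETURN INTEGER(low32).
END FUNCTION.

DEFINE VARIABLE product AS INT64 NO-UNDO.

product = 1234567891.
product = product * 10.   /* 12345678910, held in 64 bits */

MESSAGE Wrap32(product) VIEW-AS ALERT-BOX.
```

Working the arithmetic by hand: 12345678910 - 2 × 4294967296 = 3755744318, and 3755744318 - 4294967296 = -539222978, the pre-10.1B result quoted above.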

[Figure: Difference between 32 bit signed integers in OpenEdge 10.1B]

Looking at the binary underneath this, see where 10.1B detects the overflow within 64-bits and throws the 32-bit exception #13682:

[Figure: How OpenEdge 10.1B detects overflow]

But can we now declare all Progress integer math safe? Not really. Since 64-bit integers in 10.1B are as blind to the >64 bits as 32-bit integers were to the >32 bits before 10.1B, 64-bit integers in 10.1B wrap around the way 32-bit integers did before 10.1B. (Read that sentence several times if need be.) To help you understand why, Progress says “Checking for 64bit overflow would be too expensive. To do it would require putting everything in 128 bits to do calculations and then to copy it all back, since >64bits are required to do 64bit overflow checking. This would cause all arithmetic in the 4gl to grind to a halt. There are therefore no plans to do anything about this.”

So you’ve been warned. When this bites, you’ll be able to say, “Darn! And I read something about that somewhere!”

PS: And yes, for all the hard-core C jocks, signed integer overflow is undefined behavior in C: it is not guaranteed to demonstrate modulus behavior, but it usually wraps. Use of signed integers rather than unsigned integers when classic integer wrapping is needed in C is a no-no.

Dataserver for Oracle 8i De-supported.

November 22, 2007 · Filed Under Administration, Development

According to Progress the “Next Major Release after OpenEdge 10.1A” will not have support for Dataserver for Oracle 8i. Progress Software lists Dataserver for Oracle 10g as the Replacement Feature. Extended Support ends December 2007.

Progress defines “De-supported” usage as: “De-support is used where changes in technology or standards have made a feature obsolete and it is removed from the OpenEdge product. De-supported features have replacement equivalents and have zero impact on backwards compatibility.”

If you are stuck with an older embedded version of Oracle and need to move beyond OpenEdge 10.1A it may be time to explore available alternatives.

Most Dataserver users are not affected by the decision to De-Support Oracle 8i as they are able to upgrade Oracle.

OpenEdge ASSIGN Statement - More than just performance.

November 22, 2007 · Filed Under Development, Performance Tuning, Reliability, Trivia

Back in the early ’90s when I broke the news of the ASSIGN statement in a Profiles in Progress article, I had no idea what silliness would follow. I also had no idea that twenty years later we would be seeing newly written code that still gets it wrong.

For the few who are as yet unaware, wrapping consecutive assignments with an ASSIGN has multiple benefits.

a = 4.
b = 3.
c = 7.

isn’t as good as

ASSIGN a = 4
       b = 3
       c = 7. 

Shortly after the Profiles in Progress article was published there were signs that some in the Progress community were getting it wrong. Even though the article clearly graphed variations in execution time, some performance pundits started running around saying the ASSIGN statement was 2.7163639281 times faster than not using the ASSIGN. I’m exaggerating the precision of the number, of course, to make a point. None of those digits (including the initial “2”) are significant. None.

The benefit of the ASSIGN statement varies for many reasons, not the least of which are whether the fields being assigned are components of a common index, or part of a key that fully qualifies a unique index. Let’s demonstrate one of these two factors using the example above.

Let’s suppose that a, b, and c are components of an index a-b-c. And let’s start, for simplicity’s sake, with a, b and c all having the value of “1”. In the ASSIGN-less code snippet, the index a-b-c will move through two transitional values, 4-1-1 and 4-3-1, before it gets to 4-3-7. Using ASSIGN, index a-b-c becomes 4-3-7 directly. Not using ASSIGN causes three-fold the activity:

[Figure: ASSIGN DB Activity]

Further, if index a-b-c is unique and the transitional values 4-1-1 or 4-3-1 already exist for another row, then the code without the ASSIGN statement will fail. Yes, fail.

Here’s a diagram that illustrates this index key collision for the above example:

[Figure: ASSIGN Index Collision]
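The collision scenario can be set up in a few lines. This is a sketch that uses a temp-table in place of a database table; the names ttKey and abc are made up for illustration:

```abl
/* Three key fields covered by one unique index, as in the example. */
DEFINE TEMP-TABLE ttKey NO-UNDO
    FIELD a AS INTEGER
    FIELD b AS INTEGER
    FIELD c AS INTEGER
    INDEX abc IS UNIQUE PRIMARY a b c.

/* Another row already holds the transitional key 4-1-1. */
CREATE ttKey.
ASSIGN ttKey.a = 4
       ttKey.b = 1
       ttKey.c = 1.

/* The row we want to change starts out as 1-1-1. */
CREATE ttKey.
ASSIGN ttKey.a = 1
       ttKey.b = 1
       ttKey.c = 1.

/* Safe: the key goes from 1-1-1 straight to 4-3-7 in one statement. */
ASSIGN ttKey.a = 4
       ttKey.b = 3
       ttKey.c = 7.

/* Risky alternative, left commented out: the first statement alone
   would try to index the row as 4-1-1, which already exists.
ttKey.a = 4.
ttKey.b = 3.
ttKey.c = 7.
*/
```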

The ASSIGN statement is not just an optional performance improvement as some believe. In some contexts, likely those least considered, lack of ASSIGN will affect reliability. Reliability is more important than Performance. Performance is also impacted as, in the example above, the application database must now perform triple the number of index lookups, inserts, and removes. Not good.

So everyone is using the ASSIGN statement, right? Sadly, no. One would think that we should find proper use of the ASSIGN statement in all code written since the 80’s. Sadly, this isn’t the case, even in newly written 2007 code. We’ve seen it with our own eyes. As a community of Progress users, I know we can do better.

While this article isn’t an exhaustive presentation of the reasons why using ASSIGN is better, I’m hoping that the reliability and performance example above is compelling enough. It should be.

Hope this helps.