This post by Steve Nelson and this follow-up by Nat Papovich over at the Webapper blog have generated a lot of comments about the method-local (var) scope, the THIS scope, and the VARIABLES scope within a CFC. The result? It appears there's still a lot of confusion on these three issues.
I fought the long and difficult battle of getting these straight a while ago, and in my complacency I assumed that everyone else had as well.
Nope.
So I thought I'd post my own thoughts since I've already written a small book in the comments over at Webapper. I don't want to debate the fact that we have to declare var-scoped variables. Adobe is aware of this issue and if they can fix it or find an easier way to handle it without breaking too much existing code then I'm sure they will. I want to go more basic here and just make sure everyone who reads this understands these three absolutely critical elements of developing with CFCs.
First, the THIS scope. Inside a CFC, the THIS scope is public. That means any external code can access and modify any data in the THIS scope. Which is why the THIS scope is so dangerous. One of the most important jobs of an object is to encapsulate its internal implementation and state. The THIS scope breaks encapsulation in a terrible way. You're basically opening the guts of your object to the world, not only to see, but to change. Bad, bad, bad! For these reasons, most CF folks avoid the THIS scope. And unless you are very sure you know what you are doing and have a very good reason, you should too. Honestly, the only time I've seen a good use of the THIS scope is in a Transfer Object, which is nothing but a typed structure anyway.
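To make the danger concrete, here is a minimal sketch. The Product component, its price key, and the calling page are hypothetical names made up for illustration; they are not from any real codebase.

```cfml
<!--- Product.cfc (hypothetical): keeps its price in the public THIS scope --->
<cfcomponent>
    <cffunction name="init" access="public" returntype="any" output="false">
        <cfset this.price = 10 />
        <cfreturn this />
    </cffunction>
</cfcomponent>

<!--- Calling page (hypothetical): nothing stops outside code from
      reading or overwriting the value directly --->
<cfset product = createObject("component", "Product").init() />
<cfset product.price = -500 />
<cfoutput>#product.price#</cfoutput>
```

The object never gets a chance to reject the negative price, because no method of the CFC is involved in the assignment.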
Next, the VARIABLES scope. The VARIABLES scope is very similar to the THIS scope, except it is private. This means external code can't access or modify this data. This is a Good Thing. By hiding your data in a private scope, you force external code to access your data by using the public methods of your CFC. This is also called the API for the CFC. Forcing client code to use your API leverages encapsulation, and gives you a lot more freedom to make changes to how your CFC does things and stores internal data. As long as your API methods remain the same, you are free to change the internal implementation of your CFC as often as necessary.
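As a rough sketch of the same idea with private data, assuming a hypothetical Product component whose only public API is init() and getPrice():

```cfml
<!--- Product.cfc (hypothetical): price lives in the private VARIABLES scope --->
<cfcomponent>
    <cffunction name="init" access="public" returntype="any" output="false">
        <cfargument name="price" type="numeric" required="true" />
        <cfset variables.price = arguments.price />
        <cfreturn this />
    </cffunction>

    <!--- The only way for outside code to read the price --->
    <cffunction name="getPrice" access="public" returntype="numeric" output="false">
        <cfreturn variables.price />
    </cffunction>
</cfcomponent>

<!--- Calling page (hypothetical): goes through the API --->
<cfset product = createObject("component", "Product").init(10) />
<cfoutput>#product.getPrice()#</cfoutput>
```

External code that tries to read product.price directly should get an error here, since price is not in the THIS scope. If getPrice() later needs to apply discounts, only the CFC changes and the calling page stays the same.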
Finally, the VAR scope. Within a method, if you want to declare a variable that only should exist for the duration of the method call, you must var scope it at the top of the method body like this:
<cfset var user = "" />
I think most people would agree that it's annoying but that's irrelevant. You HAVE to do this. And this is probably one of the most critical things anyone can know about using CFCs, so I'm putting it in bold: If you don't var scope a variable, that variable is set into the VARIABLES scope of the CFC. Let this become a mantra.
To put it into code, this:
<cffunction name="myMethod">
    <cfset userName = "Brian" />
</cffunction>
Is the same as this:
<cffunction name="myMethod">
    <cfset variables.userName = "Brian" />
</cffunction>
Depending on the situation, this can just be bad, or it can be horrible.
The bad: if the CFC instance only exists for one request (called a "transient" or "per-request" CFC), a non-var-scoped variable can cause unexpected behavior if the variable is var scoped in some methods and not in others. It's also very possible for your CFC to be transient in one context or application and persistent in another. So you should assume the worst and var scope it even if you think your CFC will only be used in a transient way.
The horrible: if the CFC instance is cached in the session or especially the application scope, non-var-scoped variables are almost guaranteed to cause crazy bugs that will seem impossible to track down. This is because multiple server threads can be using the CFC instance at the same time.
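Here is a small sketch of how that goes wrong, and the fix. The Counter component and its method names are hypothetical; assume one instance is cached in the application scope and called by many requests at once.

```cfml
<!--- Counter.cfc (hypothetical): one instance cached in the application scope --->
<cfcomponent>
    <!--- Unsafe: result and i are not var scoped, so they land in the
          VARIABLES scope and are shared by every thread using this instance --->
    <cffunction name="countToUnsafe" access="public" returntype="string" output="false">
        <cfargument name="limit" type="numeric" required="true" />
        <cfset result = "" />
        <cfloop from="1" to="#arguments.limit#" index="i">
            <cfset result = listAppend(result, i) />
        </cfloop>
        <cfreturn result />
    </cffunction>

    <!--- Safe: both variables are var scoped at the top of the method body,
          so each call gets its own copies --->
    <cffunction name="countToSafe" access="public" returntype="string" output="false">
        <cfargument name="limit" type="numeric" required="true" />
        <cfset var result = "" />
        <cfset var i = 0 />
        <cfloop from="1" to="#arguments.limit#" index="i">
            <cfset result = listAppend(result, i) />
        </cfloop>
        <cfreturn result />
    </cffunction>
</cfcomponent>
```

With the unsafe version, two simultaneous requests can overwrite each other's result and i, so a caller may get back a list containing values from someone else's loop. The bug only shows up under concurrent load, which is why it is so hard to reproduce.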
So to summarize: DON'T use the THIS scope unless you have a very good reason to. DO use the VARIABLES scope to keep your CFC's data private. And DO VAR-SCOPE all method-local variables no matter what.
Comments (31)


# Posted By Ryan Guill | 2/6/07 8:41 AM
Good post. I also just want to point out that there is one other method local scope that we always use but seem to forget about: the arguments scope. You can set variables into the arguments scope inside of a method, and they will be gone after the method ends. Whether that is best practice or not probably depends on your situation, but you can use it that way.
# Posted By Peter Bell | 2/6/07 9:33 AM
Hi Brian,
THANK YOU!!!! I also thought a post like this was unnecessarily obvious - until I saw Steve's post and the associated comments.
Required reading. If you know it, no harm, no foul, 30 seconds wasted. If not and you've been *wondering* why you were getting those strange occasional bugs in your singletons, probably the most valuable post of 2007 :->
# Posted By Javier Julio | 2/6/07 9:47 AM
A great post as it holds up very well. I have had no problem with using the var keyword and always assumed it was the only way to go because in other languages it's always been the same but maybe I'm wrong. I personally love using the var keyword (although it would be nice to have an option not to use it but still keeping vars local in function) because I can directly tell that a variable is local to that function. Otherwise I will automatically think it's part of the variables scope. I don't use the this scope at all (only when in my cfreturn for my init). Again a great post. This should clear things up.
# Posted By Ben Nadel | 2/6/07 10:31 AM
Brian, excellent post... but I have to say that I am a bit tired of people bashing the THIS scope all the time. Yes, it's easy to change, but honestly, due to the highly dynamic nature of ColdFusion, if someone wants to change a variable, there is nothing really stopping them from injecting "spy" methods into an existing CFC and destroying it. Privately scoped objects just make it harder for someone who is determined to perform shenanigans.
Granted, I am not a huge OOP guy, so I do not speak from experience (maybe that is an advantage, maybe not)... but one of the HUGE benefits of the THIS scope is that it can be accessed without a method call. If you have an object that is heavily trafficked and you have control over the whole coding environment, then accessing the THIS scope can actually save a lot of processing time (I have seen this happen, it is not a myth).
So anyway, I don't want to argue about it... I just want to tell people that THIS can be cool, she's just as nice and pretty as the rest of the scopes, don't write her off.
# Posted By Brian | 2/6/07 12:19 PM
If you aren't doing OOP then these rules do probably go out the window. But in OOP, you will almost NEVER see anyone use public instance variables. The most fundamental element of OOP is encapsulation and public instance data blows encapsulation out of the water.
Though I must admit, I've built many high-traffic sites using OO techniques and never found the fact that I'm calling methods to manipulate an object to be a performance issue. There is a simple rule that states that there can only be one bottleneck in an application at a time, and in CF (or any web apps in general) this bottleneck is almost always the database structure/SQL code being used. Worrying about using the THIS scope instead of calling a method on a CFC would be way, way down on my list of probable bottlenecks in an application.
# Posted By Peter Bell | 2/6/07 1:27 PM
Brian - +1. Premature optimization and you're losing benefits of encapsulation. OK for small/short projects, but encapsulation is one of the nicest things you get from OO - especially as your apps grow. Keeps the refactoring much simpler not having to worry who else might be accessing your public vars.
# Posted By Justin Alpino | 2/6/07 2:33 PM
Nice post...
Sort of off topic at this point from the direction of the latest comments, but one thing that I find less than appealing with var scoped variables (aside from having to explicitly define them) is that you must make the declarations before any other processing or variable declaration occurs in your method. When you have a large method with lots of looping or such, you could end up having quite a few var scoped variable sets, which imo is ugly. I've found that creating just 1 var scoped structure to which I add keys negates the need for multiple declarations, for example:
<cfset var local = structNew()>
...Do a bunch of stuff ...
<cfloop from="1" to="10" index="local.i">
......
</cfloop>
It’s not a new concept but I thought I would share it anyway, in case other developers feel the same way I do about having multiple var declarations.
# Posted By Brian | 2/6/07 2:38 PM
@Justin: you said "When you have a large method with lots of looping or such, you could end up having quite a few var scoped variable sets, which imo is ugly."
All true, but IMHO what this really means is your method is doing too much and should be refactored. Lots of small, cohesive methods always wins over fewer big, complex, procedural methods.
# Posted By Justin Alpino | 2/6/07 3:10 PM
@brian - I agree, I'm a big fan of keeping methods short and concise also, and in keeping with the DRY principle that certainly makes sense to break things out, but I was speaking more in general terms (and probably should have used different words to express it). I see that you employ the same technique in your methods (ie. CFCStubGenerator.parseCFCText()). The point I was trying to make was to offer an alternative to having many var scoped declarations.
# Posted By Adam Cameron | 2/6/07 3:19 PM
Good post, Brian.
I'm mostly inclined to agree with you about the THIS scope, but I think that if one treats it as a "write once" property of the object it's OK. What I mean by this is that I have no problem exposing some stuff to the calling code via the THIS scope (for expediency's sake), but THIS-scoped variables only EVER appear on the left-hand-side of an expression within the CFC itself, so there's no(*) chance that the calling code can bugger up the CFC instance.
That said:
1) I hardly ever have had call to do this;
2) It'd be so much nicer if they could be set as read-only / final / something.
--
Adam
(*) Well: "hardly ever". I'd be thinking long and hard about using this technique if the CFC was ever destined to be cached in some persistent scope.
# Posted By Dave Shuck | 2/6/07 3:20 PM
I realize the CF8 wishlists played out months ago, but when I read the line "I think most people would agree that it's annoying but that's irrelevant", it crossed my mind that I wish they would add 2 things to CF8:
1) var variable1 variable2 variable3 = ""; // or something like that
2) getting rid of the restriction of declaring var scope variables at the top of the method.
# Posted By Qasim Rasheed | 2/6/07 9:35 PM
Brian,
Excellent and timely post which will surely help people to understand the true nature/limitations/features of various scopes within a CFC. Just to reinforce what you have already said, here is a PDF document by Ray Camden listing all CFC scopes and their purpose.
http://ray.camdenfamily.com/downloads/cfcscopes.pd...
# Posted By Michael Dinowitz | 2/7/07 1:26 AM
I see no reason what-so-ever not to use the THIS scope. The religious argument of "breaking encapsulation" is just that - religious. The argument that anyone can set it is not true either. Who is altering the this scope. Where is the code that is doing it is more to the point? How is direct access to the this scope different than a general getter/setter (other than being faster)?
I can (and will) go on about this but the bottom line is the same. I disagree with the chant of "no this" and think people should really think about it first before joining in.
# Posted By Brian | 2/7/07 1:37 AM
Michael, you would not want someone to view or modify data in the THIS scope in almost all situations for the same reason you wouldn't want anyone to view or modify private instance data: there is more to an object than its instance data.
I think people from a more procedural mindset often focus on the data an object contains (and the related bean-pattern-mandated getter and setter methods) at the expense of the BEHAVIOR of an object.
Say I have a Product object. Within the object is an instance variable named price. What is the harm in letting people directly access (not even change though that is even worse) this instance variable? I would argue there is great harm, because the client code is bypassing the API (public methods) of the object and looking directly at internal state and internal implementation.
Right now my Product may only have a simple instance.price variable. So using THIS and bypassing my getPrice() public method might not seem so bad. But in the future, when my getPrice() method does all sorts of crazy tiered discounts and bulk discounts based on the price variable, very possibly using other objects to make these determinations at runtime, if my client code isn't going through the getPrice() method, I'm going to be in trouble.
And this doesn't even get into the arguably much more dangerous issue of letting external code CHANGE the state of my object directly. At least using private setters I can make instance data read-only.
Basically there are a whole lot of reasons to not use the THIS scope and exceedingly few reasons to do so. I'll never say "NEVER" use the THIS scope, but I will say I think in almost all situations it is a really bad idea.
# Posted By Michael Dinowitz | 2/7/07 1:50 AM
Um, who is this someone? We're talking code here, not people. What code should not see the data in the object and what exactly would happen if it did. What code will bypass a public method? Can I as a user access the this scope of an object? Can I alter it? I think the answer is no.
We're talking code here. And again, the question is what code is changing what state, when, where and how is this different than most other changes to a state?
I'm writing a blog entry as we speak on this. If we step away from the 'someone' and look instead at code the question becomes what code can touch the object, who can alter the code and does all this really matter.
# Posted By Brian | 2/7/07 2:03 AM
"Someone" is any external client code or system.
The bottom line is that the effort and overhead to protect the state of an object and force client code to go through the object's API is minuscule. If my previous example of the perils of directly accessing instance data, especially in terms of future change, doesn't make you reconsider your position, I'm not sure what will. You're essentially arguing against encapsulation, which is the foundation of every OO language in existence and has been lauded by virtually every great programmer you can find.
I suppose that until you experience the pain (and I have on too many occasions) it seems rather nebulous. Using my previous example, if you don't use myObject.getPrice() and instead use myObject.price and later you need a lot more complex behavior when asking an object for its price (a very likely possibility), that is when the cost of directly accessing instance data will become apparent.
So again, the cost of forcing external code to use the object API is virtually nothing, but the benefit is very great.
# Posted By Brian | 2/7/07 2:34 AM
Just a follow-up: the well-established OO approach to "encapsulate by convention, reveal by need" is at the core of what I'm trying to say. It's much easier to break encapsulation later, if you need to, than it is to try to add it back in. It's kind of a Pandora's box.
# Posted By Toby Tremayne | 2/7/07 4:07 AM
I have to agree with Brian specifically for the last comment he made (before this one).
Like a lot of best practise stuff, the need for it isn't seen on the first pass of writing code. It's when you're making changes, doing maintenance or adding outside calls (3rd party apps) to it that you find the need.
One example is that if you have a shopping cart object, and various parts of your shop muck about with the data directly in the this scope, then if later on you need to make that cart much more complex (perhaps adding tax, promotion calculations etc) then you may well have to edit all the places in the shop that talk to the cart.
If instead you'd had the shop calling the getters and setters within the cart, you can simply make the changes within the cart object, and the rest of the shop would get the results you want it to.
# Posted By Kola | 2/7/07 6:01 AM
Just to further elaborate on Brian's comment:
-- "Someone" is any external client code or system. ---
If you have a component which forms part of an API which is to be used by other client code on other projects, by other people on other systems etc. then it becomes even more critical to shield the client code from the internal implementation - they *should* be using the published public api - developing code which is strongly tied to the internal implementation of another component can lead to maintenance nightmares when that code/library/api is updated - particularly if it's an open source library
# Posted By Michael Dinowitz | 2/7/07 6:25 AM
Brian and Toby:
Your examples make the case for setting and getting based on 'if we want to change later' but what of cases where we only expect the data to be exposed in a simple fashion, such as when the data is about the object itself? The application.cfc uses the this scope to contain data about the application itself, such as the application name. If I have a cf-talk instance of a list object, what's wrong with having the listid or listname exposed in the this scope? There is a need and the information describes attributes of the object itself. would that not fall under the oo paradigm?
And to follow up on your statement about it being easier to break encapsulation later, it doesn't seem like that would ever be done based on the same rules that make us want to use encapsulation in the first place.
# Posted By Ben Nadel | 2/7/07 8:03 AM
I feel like I must be a horrible programmer (and this is not the first time) because I seem to be the only one who ever feels that their page gets slowed down by the number methods calls that takes place :( Perhaps this goes hand in hand with me being a bad OOP programmer and I guess, now that I think about it, perhaps this is the reason that I am like one of four people on the planet who think CFFlush is a requirement for proper page rendering.
Let me bring this discussion over to Javascript as I feel that I am highly proficient in Javascript, at least as opposed to OOP in ColdFusion. Now, let's look at the document object model. Each DOM object is an object right? It has properties and methods that are available for it.... so how come these things available to the DOM are not all method calls? Why is there:
DOM.parentNode
DOM.previousSibling
DOM.nextSibling
DOM.nodeType
DOM.childNodes
... but, then on the flip side, there ARE methods like:
DOM.getElementsByTagName()
DOM.appendChild()
... Is the belief that Javascript is just a poorly thought out OOP language? Maybe my problem with the whole bashing on the THIS scope is because I feel I work with a lot of really successful situations that do not use it.
Now, you might argue that things like parentNode are read-only and that javascript will throw an error if you try to modify it. I would argue that that is a moot point. If you are not supposed to modify a THIS-scope variable, and then someone does it anyway, well.... that's not a programming problem, that's a problem of incompetence of the programmer???
And what about constants in Java objects. I use Java objects that always have constants for reference OBJECT.CONSTANT_VALUE. How come these are not accessed as static class methods?
(I am NOT attacking with these questions... I am very curious as to why the inconsistency across languages and situations).
# Posted By Scott Stroz | 2/7/07 10:35 AM
I agree with Brian K on this one.
When I am using <cfcomponent> to create objects, as opposed to a library of functions, and I use the THIS scope, it stops feeling like an object, and more like a structure which just happens to have some methods you can access. And, to me, this just doesn't feel right.
# Posted By Brian | 2/7/07 10:55 AM
@Michael: "where we only expect the data to be exposed in a simple fashion" sounds highly suspicious to me. I'd be looking for the "but" or "until" or "unless", if you see what I mean. Humans are pretty bad at predicting the future.
That said, you will note that I have been saying "almost never" with regard to using the THIS scope. There are times (look at a Reactor Transfer Object) where it is probably OK. I just think those times are very few and far between.
Application.cfc uses the THIS scope and it was a bad decision on Adobe's part, IMO.
Finally, there may indeed be times when you want to break encapsulation later. It is possible. But the same care should be taken in deciding to do it as was put into making things encapsulated in the first place.
@Ben: Yes, JavaScript is a horribly thought out OOP language. More to the point, it is a non-OOP language that had OOP features bolted onto it over a very long period of time.
Regarding Java constants, these are more acceptable because they are almost always marked as FINAL. That means once they are set, nothing else can change them.
You said your "page gets slowed down by the number methods calls that takes place", which is true, but only very slightly. Unless you have debugging with Report Execution Times turned on. I'd argue that the vast majority of the time, this tiny performance hit is worth the flexibility and ease of maintenance that using API methods provides.
And it is interesting and timely to mention consistency. Another reason to use methods as the norm is consistency. If you have some data that you want to expose directly (the THIS scope) and some that you need to require client code to access via a method (anything subject to change in the future), you now have an inconsistent interface to your object. Again, there is very little overhead in just having methods that client code can use.
# Posted By Peter Bell | 2/7/07 12:25 PM
+1 on the method calls. The benefits from OO come mainly from encapsulating things and that requires method calls. I also used to worry about this but don't any more. If I found a real performance bottleneck with my method calls, I'd probably just throw more servers at the problem. If that wasn't an option I'd drop down to Java or even if it was crazy critical pay someone else to drop down to c++ which is what you want to be writing in where performance is that much more important than your time.
# Posted By Jim Priest | 2/7/07 2:12 PM
I think CF8 should announce a new "THAT" scope so these discussions can get even more confusing... :)
Thanks for the great post - I'm just getting started in Mach-ii and OO so all these discussions are very informative!
# Posted By Michael Dinowitz | 2/7/07 4:21 PM
I was going to mention an application with 363 method calls and the need for speed. I was going to mention page based cfcs rather than cached ones using the this scope. I was going to make more arguments for it but I'm just too tired and gun shy after recent events. You can see it in my last post, this one and the lack of a next one. I still think there are places where this would prove useful but the argument just doesn't matter.
# Posted By Brian | 2/7/07 4:34 PM
There ARE places where it might prove useful. They are just very rare.
I have a hard time believing that the overhead of 363 method calls added up to more than a few milliseconds, but if it did, and performance took precedence over all other design elements, including future maintainability, then you might have a perfectly valid case. I shy away from premature optimization like the plague. And when optimization is required, I focus the majority of my effort on the most common performance hog: the database design and the SQL.
Not trying to pounce on you Michael. We all know you're a sharp guy and love all that you do for the community. I just feel strongly that encouraging the use of the THIS scope is a dangerous recommendation, especially to new and intermediate developers. If you really know what you're doing and truly have a valid need to use it, then go for it. I'm just saying those cases are going to be extremely rare.
# Posted By Peter Bell | 2/7/07 4:48 PM
Hi Michael,
I think everyone respects you and the work you do, so I wouldn't take this stuff personally. It is like Mixins - they can be a great tool but it is important to teach the dangers as well as to teach the possibilities. Please keep up the great work and the great postings - I always learn something from your postings and would hate to miss that just because people are clarifying the downside of using your specific tools in more general situations.
I think the biggest issue is that you are in the business of doing something that most programmers should not be focusing on. You help people to optimize performance of systems and as a general approach to developing that is a horrible starting point, but when you have a creaking server and bring in Mr. D to do his stuff, it is *exactly* the right thing to focus on.
That said, I'd drop down to Java before you'd take my method calls from my cold dead fingers - but that's just me :->
# Posted By Ben Nadel | 2/7/07 8:03 PM
@Brian [Yes, JavaScript is a horribly thought out OOP language.]
I figured as much.... I still love it though :)
@Brian [Another reason to use methods as the norm is consistency]
I think this is one of the biggest selling points (at least in my young-OOP mind). I am a fan of consistency and use it everywhere from my naming conventions to my white-space usage. So why not with variable access??? So yes, I agree, consistency is a HUGE selling point.
# Posted By Rob Gonda | 2/8/07 12:00 AM
The var scope is a beautiful thing, resembles Java, As2/3, but I just wish you could declare a local var anywhere in the function ... that's my main beef with it; it's silly. Why do you have to declare it at the top of the function ... it complicates mixins, people miss scoping queries, there's all kinds of problems because of the lame way it's implemented...
# Posted By Donnie | 4/24/07 1:41 AM
Jumping on this bandwagon a bit late but just noticed a strange thing, if I passed in an argument to a function and then declared a var-scoped variable with the same name CF throws an error saying you cannot declare a variable twice. I think this is a bit odd....
E.g.
<cfargument name="foo" type="struct" />
<cfset var foo = structNew() />
is invalid....one would think that these would be completely separate in terms of accessing them...