Yeah, I know that I am late with my RBS vs. EBS information but in the interim, here's a great article about RBS from Todd Carter...
So unlike little old RefArch 6, this guy is the real deal. In this architecture we aggregate the actual content from a multitude of SharePoint sites. The content is transparently taken from SharePoint's control then stored and managed in a truly aggregated single location. In theory, there are a number of different ways of doing this:
Architecturally the net result is that all of your content from all of the key SharePoint sites ends up being stored in a single, centralized location. It is then managed as a single set of content, it can have hierarchical storage management applied depending on the value of the content, it can be managed with a single set of security & compliance policies, it can have duplicated content thrown away...if you want to see an example of this architecture in play then take a look at the more advanced email archiving solutions in the market.
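To make the "duplicated content thrown away" idea a little more concrete, here is a minimal sketch of content-hash de-duplication. It is purely illustrative and is not how any particular product does it: the function name `deduplicate`, the site paths and the in-memory dictionaries are all my own assumptions, standing in for a real centralized store.

```python
import hashlib
from typing import Dict, List, Tuple


def deduplicate(items: List[Tuple[str, bytes]]) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """Store each distinct payload once, keyed by its SHA-256 digest.

    Returns (store, index): `store` maps digest -> content, and `index`
    maps each original document path -> digest of its content.
    """
    store: Dict[str, bytes] = {}
    index: Dict[str, str] = {}
    for path, content in items:
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # keep only the first copy seen
        index[path] = digest
    return store, index


# Hypothetical documents aggregated from three SharePoint sites.
docs = [
    ("site-a/policies/travel.doc", b"Travel policy v3"),
    ("site-b/hr/travel-copy.doc", b"Travel policy v3"),   # same bytes as above
    ("site-c/finance/expenses.doc", b"Expense policy v1"),
]

store, index = deduplicate(docs)
print(f"{len(index)} documents, {len(store)} stored payloads")
```

Every site still "sees" its own document through the index, but identical content is only held once in the central store.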
So what's the best way to do this? So herein lies a bit of a Blogger's dilemma; rumor has it that quite a few people read my Blog - most importantly, you do and you are my favorite reader. I do actually have an answer to how this is best carried out but I'm sorry to say that I am not going to share it at this point. If it were entirely in my hands I'd consider it but the solution to this problem requires a fair amount of collaboration and technical assistance from Microsoft and from some of our key partners and I am not willing to divulge the details until we at EMC have made a bit more progress. Sorry.
Don't you just love it when this happens? You get to the end of a 2 month Blog saga and the penultimate page has been torn out. Hope you don't hate it too much because part 8 is going to have a similar ending. In my defense, the solution to this problem will blow your socks off and will be worth the wait.
If you are under an EMC NDA and are attending EMC World this year then maybe, just maybe we can chat about the details.
Craig Le Clair from Forrester Research posted an interesting article last week, (I was on vacation skiing so oddly enough I didn't read it until this week). In it, he suggests that the upper limit on a SQL 5 installation running on a 64 bit architecture is about 500GB - oddly enough 5 is the average number of drinks I had each night when I was on vacation last week, there were 64 runs on the ski slopes and I fell over exactly 500 times on vacation last week. I suspect that Microsoft will not agree with these numbers but in truth they are pretty realistic, I really did limit my drinking, go to a big resort and fall over a lot ;=)
Seriously though...I am sure that Microsoft will dispute the 500GB limit so I'd like to propose an informal pole, (not a ski pole which I used last week on vacation). Do you have more than 500GB running somewhere and if so - what is your architecture?
Did I mention that I was on vacation last week?
I had a really interesting comment posted on my "SharePoint 2008 Report" by Marko Sillanpaa. Marko is half of the team over at http://www.BigMenOnContent.com and has been around the ECM space for a long time. His question was interesting enough for me to promote the answer to a blog entry. Here's Marko's comment, (reproduced without his permission or any regard for whether he minds or not.)
Thanks Andy. This is a great conference summary for those of us who wanted to be a fly on the wall.
I’d like to ask a question about your first bullet. You said that SharePoint sees themselves as a platform. How should we look at Documentum, a solution or a platform? It’s easy to see how a solution and a platform would work together. But when I look at two solutions to solve the same problem or two platforms to build the same basic sort of solutions, I just don’t see the value.
It sort of like why have both SQL Server and Oracle in the same space. If I need to build a database application, I’d look to the one that does the most for me. I wouldn’t put some tables in Oracle and other in SQL Server. Nor would I necessarily start in say SQL Server and then move them over to Oracle, unless I was looking to warehouse the data. Maybe I answered my own question. Is Documentum a data warehouse for SharePoint data?
This is by far my most 'rambly' posting ever so I've highlighted the actual response in bold at the bottom. Ignore the rest, it is complete self-satisfying drivel.
Let's start with the most obvious question - is Marko calling me Andy because he knows that I hate that or was it just him being overly familiar? (FYI: I am not short of nicknames for Marko if we want to start playing the silly-name-game so bring it on.)
Secondly, my comment was actually that the attendees at the conference viewed SharePoint as a development platform. I know that Microsoft do but I was surprised that every single person with whom I spoke agreed 100%. I loved Marko's analogy of SharePoint and Documentum being like two database systems. He succinctly asked the question that I think a lot of people are pondering right now. Are SharePoint and Documentum just different flavors of the same pie?
I wonder what Marko expects me to say or what he thinks I might want to say if I wasn't under 24/7 Microsoft Taser watch. Would I say that we've been doing it longer, they don't know what they are doing, they are the new kids on the block, they'll be gone in 12 months, Microsoft is just an overgrown startup, SharePoint will never catch on, my ECM system could beat up your ECM system..? The answer is...all of the above!
Seriously though, it is a great question and no longer just some academic, positional or defensive debate. The truth is that the pundits who really understand what is happening in the market are past the "it is a head-to-head" competitive situation argument. The relationship between a traditional ECM solution and SharePoint is much more subtle yet fundamental than that.
Before I start my longwinded analysis I'd like to address a word that I hear bandied around with regards to this relationship. "Coopetition" - this is used to describe the situation where two companies are simultaneously competing and cooperating. Firstly, I hate made-up words, (except voluntold - to describe when you are told to be a volunteer), and secondly that's not a true description of the relationship between Microsoft and the traditional ECM vendors. Let there be no doubt - Microsoft are competing head-to-head in some areas of the business but in my ever-so-humble opinion they are competing head-to-head in less areas than they are not. Grammatical elegance notwithstanding, what I mean is: the gaps between SharePoint's capabilities and a traditional ECM solution are larger than the areas of overlap.
Here are a couple of great questions - Why is there such a big gap? Are Microsoft incapable or incompetent? Let me answer the first question, (the guy with the Taser is staring at me right now so I might not address the second question at all.) It has taken Documentum 14 years to build the capabilities of their ECM system. If I was starting from scratch I'd realize that even the behemoth that is Microsoft could not do it all in a reasonable length of time. If I was Microsoft, owner of the desktop, master of the productive application & champion of development tools where would I focus? Hell, I'd make sure that my solution was not just unified or connected to the desktop - I'd make it a core, native part of that environment, I'd close that relationship tighter than a...insert your own euphemism here.
I'm an amateur quantum physicist, (really I am; I'm studying string theory and multidimensional universes right now), and I know that in a parallel universe somewhere this posting slipped back in time and influenced Microsoft to take this approach so kudos to me for this foresight.
If I seem to be rambling on interminably and you are getting bored then it is because I am high right now; 37,000 feet and 5 cups of coffee high to be exact. I'm on a 5 hour flight back from Seattle and if I am stuck in 14C bored you might as well share some of my pain.
So, SharePoint and traditional ECM solutions overlap. They both have a repository, they both provide library services, (view, check out/in, etc.), they both have clients, etc. I'd contend that SharePoint's focus is on integrations in to the Microsoft desktop at the client and integrations in to Microsoft environments at the back end. So where would a conventional ECM solution add value to a deployment of SharePoint?
There are many more examples but the bottom line is...if you view SharePoint as owning collaborative, in-progress, manually created Office-centric data then it looks like a pretty good fit. If you then add traditional ECM to manage heterogeneous, specialized, multi-channel, high-value, long-term, process intensive content then you truly will have the best of both worlds.
The guys from Microsoft might not like the following analogical response to Marko's original analogy but cut me some slack, can you think of a better one? Here's my response: I see the "Oracle vs. SQL Server" comparison as being more like "MS Access vs. Oracle". Microsoft Access is an amazing database, especially if you work in a Microsoft environment and are doing standalone, departmental applications; it does everything a database needs to do but within a limited scope. If you wanted to roll out a long-term, shared, enterprise-strength, secure, scalable solution you'd either go with Oracle/SQL Server or perhaps you might back-end Access with a centralized Oracle/SQL Server system that integrates in to your other enterprise solutions. Obviously, SharePoint is a much larger, more integrated and more extensible solution than Access but as an analogy I do see the comparison as being valid.
Here's how I see it from my ivory tower - SharePoint is a great access point in to the world of real enterprise content management. It created an entry point in to full blown ECM that 100,000,000 consumers can now access from their native working environment.
Let me preempt Marko's next question, isn't it likely that Microsoft will simply grow SharePoint in to a true enterprise content management system? Let's be clear, I am not party to Microsoft's internal long term world dominance strategy, (unless you assume that world dominance is indeed the strategy), but from how I see SharePoint being positioned in the market and how it plays in to Microsoft's core market I'd guess that they'll certainly mature in the enterprise areas but never really address it to the level that a true ECM system would today. Also bear in mind that any savvy ECM vendor is not going to sit still, they will continue to dominate in the areas that SharePoint is not prepared to address. My assumption is that SharePoint will focus more heavily on being pervasive in to Microsoft's core businesses and will turn in to a hybrid of a development platform, pseudo file system and an operating system overlay for corporate usage rather than a heterogeneous ECM system. Why do I think that? Because the former market is much bigger, less high maintenance and more aligned with Microsoft's portfolio.
Comments? Other than expounding the virtues of brevity or decaf coffee.
In the previous reference architecture we unified SharePoint and the ECM system at the Web Part layer and provided a limited subset of functionality, specifically passive operations. There's no doubt that this "passive connectivity" provides a huge amount of value, uses a nice, familiar paradigm and can constitute a complete solution in some cases. However, there are many times when you need to be able to act more fully upon the content in the disparate systems rather than just browsing and viewing the documents.
For illustrative purposes, let’s simplify the underlying architecture and then look at some typical problems with performing more active actions on content. Assume that behind our unifying Web Part we have two separate SharePoint sites. Our web part displays a unified view of all of the documents in the “Standard Operating Procedures” folder in one site and all of the documents in the “Corporate SOPs” folder in the second site.
Take a use-case where Standard Operating Procedures in the company need to be routed for approval before they can be published for general consumption. Approval of documents such as these is performed via a workflow process; let's look at this process from an end-user's perspective.
Matilda has two standard operating procedures that need to be updated. She finds the two documents in her unifying Web Part and checks each of them out of their respective SharePoint sites. She marks up the necessary changes in the documents, checks them both back in, re-selects them both and starts the “Route SOP for Approval” workflow.
So what's the problem?
This seems like a particularly simple and highly efficient scenario but it makes some gross assumptions. Assuming that one SOP lives in the first site and the other lives in the second site then Matilda was actually starting one workflow in one system and another workflow in the second system. What if one of the SharePoint sites did not have a workflow called “Route SOP for Approval” implemented? What if it existed but the name of the workflow was subtly different? What if it existed in both locations but Matilda had access to the workflow in one system but not in the other?
The problem is made worse if the underlying architecture consists of different types of systems. What if one SOP was in SharePoint and the other was in Documentum? Do you have single sign on implemented between these systems? Would Matilda even have an account in the Documentum system? You get the idea - even when both systems support the same operations the implementation and invocation of those systems will be significantly different.
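Matilda's problem can be sketched as a "pre-flight" check that asks each system the awkward questions before anything is started. This is only an illustration under my own assumptions: the `Site` class, the `preflight` function and the idea of holding workflow permissions in a dictionary are hypothetical stand-ins, not SharePoint or Documentum APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Site:
    name: str
    # workflow name -> users allowed to start it (hypothetical model)
    workflows: Dict[str, Set[str]] = field(default_factory=dict)


def preflight(workflow: str, user: str, sites: List[Site]) -> Dict[str, str]:
    """Report, per site, whether `user` could start `workflow` there."""
    report = {}
    for site in sites:
        if workflow not in site.workflows:
            report[site.name] = "workflow not found"
        elif user not in site.workflows[workflow]:
            report[site.name] = "access denied"
        else:
            report[site.name] = "ok"
    return report


site_one = Site("Site 1", {"Route SOP for Approval": {"matilda"}})
# The second site has a subtly different workflow name.
site_two = Site("Site 2", {"Route SOPs for Approval": {"matilda"}})

report = preflight("Route SOP for Approval", "matilda", [site_one, site_two])
print(report)
```

Even in this toy version the single button Matilda clicked turns in to one answer per system, and the answers do not have to agree.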
In order to resolve this issue the unifying web part has three options:
Don't get me wrong, these options are not without merit, option #1 is actually a fine solution if the end user can do her job with the "dumbed down" functionality. If you run a tight IT operation then #2 might work but if someone points the unifying Web Part to an incorrectly configured repository then it could cause wholesale chaos. Option #3 on paper looks like the preferred solution but it takes a lot of work to build this functionality and a lot of cycles to perform this real-time look up, resolve and display, (aka it will be slow).
Under the right circumstance I believe that this architecture can provide a very attractive solution for well defined problem areas. If you have control of your repositories, a well run IT department and a fairly competent development team then you are probably a fictional organization but in a good position to use this approach.
Next week tune in for the first in the series of data aggregation as an approach...in the spirit of Harry Potter movies I am thinking about making reference architecture #8 in to two blog entries - not to make more money but to ensure artistic integrity - honest.
I spent last week in Seattle at the SharePoint 2008 Conference. As an opportunity for me to attend sessions it sucked but for a good reason; I spent almost the entire conference talking to customers, partners, analysts, competitors & the Microsoft SharePoint team - it was...what's the word? Illuminating is probably most appropriate followed closely by exhausting.
Let me pick some of my top observations - serious and humorous:
- SharePoint: Solution or Platform? Before last week's conference I think that I viewed SharePoint as being a business solution or at least a broad set of capabilities. Admittedly, a solution that needed development work in order to solve a specific business requirement. Almost without exception the attendees at the conference viewed SharePoint as exclusively being a development platform. I am not saying that they don't roll out SharePoint almost out of the box to solve business problems but that's almost an aside. They see SharePoint as being a platform that gives the developer access to a plethora of integration points in to the user's working environment. Bear in mind that the audience was primarily IT not developers so this was not an "I am a developer so everything is a platform" situation.
- Business Opportunities Outside of Core SharePoint: Next year I am going to buy extra tickets for the conference. Would you believe that there were people lined up at the door who did not have tickets and were hoping somehow to get in on spec? I'm thinking next year they should have a parking lot for tailgating and people hawking tickets. Maybe they could have some up for auction and they could give the monies to charity. Seriously guys, you should do this...or maybe just get a bigger venue.
- ECM or Data Center Management? I used a presentation at the conference that was based on my world-renowned and revered eight reference architectures; it was the first time that I had used this deck. I re-learned a lesson that I should have not had to re-learn - know your audience... Here's the thing, if you are a proponent of enterprise content management then my deck probably made sense but if you live exclusively in the world of SharePoint then you could not care less. That said, take 3 or 4 of the reference architectures and label them "Managing SharePoint Data Center Sprawl" and you'd have more attendees than that Gates chap had at the keynote! Better still, call it "Preventing the need to slow down your SharePoint deployments because of the liability and risk caused by silos in the data center" and you'd get Microsoft's blessing, (marketing might have something to say about the title though.)
- SharePoint Might Just Catch On. I am pretty sensitive to things...ask anyone who works for me, Mr. Sensitive they call me. I'd have to say that this SharePoint nonsense might actually be successful. I heard a lot of people call the current 2007 version "raw" and I'd tend to agree but there are a lot of compelling things that SharePoint brings to the table. Some of them still do not make sense to me, (portfolio management as a core capability to be added for example), but perhaps that's just because of where my attention is focused right now.
- How Orange is Too Orange? If you attended the conference this is for you, otherwise it will not make sense. I sent my deck to EMC marketing for approval before submitting it and their feedback was "It is a bit orange." I guess that's better than them trying to change the actual content. Rather than addressing this directly I added something orange to each slide; I struggled to find 32 orange things to plaster on to the slides but being a consummate overachiever I managed. I'd like to think that no one left my session with less vitamin C than when they entered!
- When in the USA, Only Expect One Person in the Audience to Understand a Father Ted Joke: Whether SharePoint is really an enterprise content management system would indeed be an ecumenical matter - Thanks for laughing Mick!
- Who Reads This Drivel? People from Microsoft do...would you believe that they not only still talk to me, they actually treated me really nicely - they even bought me dinner one night. You'd think they'd know better.
- The EMC Microsoft Practice are Excellent: As you know, I tend to take a fairly agnostic approach to the fact that I work for EMC and try not to sound like an extension of the EMC marketing machine; I am of course always 100% loyal to anyone who is paying me. That said, I cannot tell you how many people commented on how great these guys are - it turns out that EMC has over 400 certified Microsoft Professionals in the practice. It is without a sliver of self interest that I expound the skills of these people just 1 week before I start the process of getting them to provide massive amounts of resources to my projects free of charge. :)
- The Hilton Seattle. By far the worst Hilton I have stayed in ever...IMHO. Avoid it.
I've registered for Tech Ed but will probably not get a chance to re-do the presentation at that event but I'm looking forward to yet another geek-fest!
If you are lucky enough to have a ticket to the hottest show in town next week then I invite you to come along and hear me talk about the eight architectures, (it says seven in the abstract because I had to submit it before the eighth one arrived). I'm speaking at 10:30AM on Tuesday March 4 in Room 616-617.
If you want to chat about any of these topics, especially if you have feedback on the architectures then pop over to the EMC booth and they will know how to get hold of me.
I'll be asking lots of questions, validating my thoughts and also seeing what Microsoft and the partner community are up to. I cannot promise that I'll Blog during the conference but if I can find time I certainly will.
I represent the previous three architectures as being “unification” however technically they do not actually unify SharePoint with the ECM system. The first architecture simply recognizes the problem, the loosely coupled solution provides movement of content and the Web Parts specific solution just creates a simultaneous view of the two disparate systems.
This fourth architecture provides a level of real unification albeit only at the client level. There are a number of content types that might benefit from this client-level unification depending on the actual functionality needed. For example you might unify data between discussion threads, datasets, messages, inbox entries, etc. However, we will take the most frequent use case, which is the unification of documents between systems. (I'll Blog about those other structured and semi-structured data types later.)
The objective of this architecture is to create an environment where the end user is not aware of where an object actually resides. If the end user needs to see three documents in order for her to do her job then she should see all three documents in a single Web Part; we should abstract their actual physical location from her. If she does need to know where the data resides physically then that should simply be a value in a column – we certainly should not insist that she navigate three separate systems or even three web parts in order to get access to the data.
In this reference architecture, a single SharePoint Web Part is used that is able to federate queries out to many disparate systems. In layman's terms, the Web Part grabs information about documents in 'x' different systems but shows the results in a single list.
If the end user simply needs to search lists of relevant documents and then view the content then this reference architecture is relatively simple. If the user needs to perform more complex operations then we start to see serious technology issues, (Note that Reference Architecture 5 will address that scenario.)
The SharePoint Web Part contains logic that “binds” it to predetermined data in the disparate systems, for example, it might be mapped to folders 1 & 2 in the ECM system, folder X in one SharePoint system and folder Y in another SharePoint system. When the user invokes the web part it queries each of the systems and returns the results in a single unified results set. Other uses for this architecture include displaying a search results set; in that case it would take a cross-system query, federate the search across systems and then consolidate the results.
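The query-each-system-then-merge behavior can be sketched in a few lines. Treat this as a rough illustration only: it is plain Python rather than real Web Part code, and `DocumentRow`, `federated_view` and the three source functions are names I have made up to stand in for the actual connectors. Note that the physical location is just another column on each row.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DocumentRow:
    title: str
    modified: str   # ISO date string
    location: str   # which back-end system holds the document


# A "source" is any callable returning rows for the folder it is bound to.
Source = Callable[[], List[DocumentRow]]


def federated_view(sources: Dict[str, Source]) -> List[DocumentRow]:
    """Query every bound source and merge the rows into one sorted list."""
    rows: List[DocumentRow] = []
    for name, fetch in sources.items():
        try:
            rows.extend(fetch())
        except Exception as exc:
            # One unreachable system should not blank the whole view.
            print(f"skipping {name}: {exc}")
    return sorted(rows, key=lambda r: (r.title.lower(), r.location))


# Hypothetical stand-ins for the real connectors.
def ecm_folder_1() -> List[DocumentRow]:
    return [DocumentRow("Lab Safety SOP", "2008-02-11", "ECM/folder 1")]


def sharepoint_folder_x() -> List[DocumentRow]:
    return [DocumentRow("Expense Policy", "2008-01-30", "SharePoint A/folder X")]


def broken_source() -> List[DocumentRow]:
    raise ConnectionError("site unavailable")


view = federated_view({
    "ecm": ecm_folder_1,
    "sp_a": sharepoint_folder_x,
    "sp_b": broken_source,
})
for row in view:
    print(f"{row.title:<16} {row.modified}  {row.location}")
```

The sketch queries the sources one after another; a real implementation would probably want to query them in parallel and apply timeouts, since the slowest system otherwise sets the pace for the whole list.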
One point to note as we move through these architectures is that as we increase the functionality we also increase the technical challenges. This will become very apparent in the next architecture but just consider a single subtlety in this one relatively constrained case. Mapping user security...Consider this example; I am logged in to SharePoint, I invoke this Web Part which queries 4 other ECM systems and 2 other SharePoint Document Libraries. I don't mean to be existentialistic but who am I in these other systems? Do I need to have credentials on those other systems? Do they need to be mapped to my home SharePoint account? Can I rely on SSO for this? Who will maintain this? Where will the mapping between systems be held? Should I just act as sysadmin in the remote systems and throw security and data confidentiality out the window?
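One possible answer to the "who am I?" question is an explicit mapping table, sketched below. Everything here is an assumption for illustration: the account names, the system names and the `IDENTITY_MAP` dictionary are invented, and a real deployment might hold the mapping in SSO or a directory service instead.

```python
from typing import Dict, List, Optional, Tuple

# (home SharePoint account, remote system) -> account in that remote system.
# Hypothetical data for illustration only.
IDENTITY_MAP: Dict[Tuple[str, str], str] = {
    ("CORP\\matilda", "documentum"): "mjones",
    ("CORP\\matilda", "sharepoint-b"): "CORP\\matilda",
}


def resolve_remote_identity(home_account: str, system: str) -> Optional[str]:
    """Return the caller's account in `system`, or None if none is mapped.

    Deliberately no fallback to an administrator account: an unmapped user
    sees nothing from that system rather than everything.
    """
    return IDENTITY_MAP.get((home_account, system))


def systems_to_query(home_account: str, systems: List[str]) -> Dict[str, str]:
    """Keep only the systems where the user has a mapped identity."""
    resolved: Dict[str, str] = {}
    for system in systems:
        remote = resolve_remote_identity(home_account, system)
        if remote is not None:
            resolved[system] = remote
    return resolved


plan = systems_to_query("CORP\\matilda", ["documentum", "sharepoint-b", "ecm-3"])
print(plan)
```

The table answers "where will the mapping be held?" but not "who will maintain it?", which is the harder half of the problem.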
If you can deal with the cross-system issues then this approach starts to show some promise. Users should not care where their data lives - they just want access to it and this approach offers that. Be aware, this approach addresses only the 'passive' use case. I don't mention check-out/in, lifecycle management, workflow, BI, etc... Add those to the equation and...well you'll see in the next reference architecture!
I am not sure that this information is really supposed to be shared with the general public, I overheard it in the exec bathrooms in Hopkinton this week so that makes it pretty "public" domain to me. It turns out that EMC is involved in an evil plot to thwart the future of science. We are buying up all of the cool mathematical and physics-related equations in the world and will then hold them to ransom.
EMC - Obviously we already own Einstein's most famous equation.
Pi - Looks like we are buying the most amazing mathematical constant in the world.
RSA - RSA is actually the name of an algorithm for public-key cryptography.
Rumor has it that we will be buying Binomial Theorem Inc, Quadratic Equation Ltd and the letter 'x'.
Watch this space...