Wednesday, October 31, 2007

SharePoint Site, SSP Profile and Active Directory Users

This post was derived from an email response to a question about how I think things work with respect to users ("people") and Active Directory, SharePoint, our Lotus Notes email, etc. Specifically, there is detail about users who do not yet exist in a site directory. It is based on research, testing and our experiences. In our environment we create profiles for everyone in our Active Directory domain and add email addresses, names and a little organizational information from another SQL database.

When you create an alert (or use other people lookups), you can pick anybody from the Active Directory domain. In any of the SharePoint lookups you are selecting from a listing that includes BOTH Active Directory and SharePoint groups, if groups are shown. In our environment, any of those users should have a profile in SharePoint (even the temporary accounts are imported, but many of those do not have email addresses) - the profiles are created and updated by two processes we run daily.

In practice it would be invalid to pick someone to whom you have not granted access to the site. Although the alert setting would be created, nothing would ever be sent to a user that has no rights to a site except for a notice that the alert was created.

People (site collection users) and profiles are not the same thing, but they are synchronized. If you add a user to a site and they were not previously in the site collection, they get added to People on that site collection. I'm not sure exactly when their email address is looked up from the profiles (I think it depends - immediately if you send them a welcome email, and slightly delayed if you don't).

If you try to set up an alert for someone who has not been previously added to a site collection (or for any other reason does not have an email address - like most administrator accounts, many temporary accounts, etc.) you will get the message (trapped error):

The following users do not have e-mail addresses specified: Username, David. Alerts have been created successfully but these users will not receive e-mail notifications until valid e-mail addresses have been provided

Set my e-mail address...
Troubleshoot issues with Windows SharePoint Services.

Here "Username, David" was my demo user. The "Set my e-mail address" link won't work for non-administrators and the "Troubleshoot issues" link won't be much help, but the main part of the message is correct - the alert is created. If the user has a profile with an email address, the system will set it up in the background and the user WILL get alerts IF there is anything they can access (but clearly the user still needs access to the site). If you had previously added the user to your site, you don't see this message after the email address has been synced.
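To make the rules above concrete, here is a minimal sketch in Python of the behavior as I have described it. It is only a model of my observations: `SiteUser` and `alert_outcome` are hypothetical names, not SharePoint APIs, and the real product may well behave differently in cases I haven't tested.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteUser:
    """Hypothetical stand-in for a site collection user (not a SharePoint type)."""
    name: str
    email: Optional[str]      # None until the profile sync copies an address over
    has_site_access: bool


def alert_outcome(user: SiteUser) -> str:
    """Describe what happens after an alert is set up for this user.

    In every case the alert itself is created; only the e-mail side differs.
    """
    if not user.email:
        # This is the case that produces the "do not have e-mail addresses" message.
        return "alert created, no e-mail until an address is synced"
    if not user.has_site_access:
        # The user hears that the alert exists, but never gets item notifications.
        return "alert created, only the creation notice is sent"
    return "alert created, notifications sent"


if __name__ == "__main__":
    demo = SiteUser("Username, David", email=None, has_site_access=True)
    print(alert_outcome(demo))
```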

I thought about prepopulating users in a members list, but I don't think this will be necessary unless we get a lot of site administrators having this problem. Only someone with "manage alerts" permissions - a site owner - could ever have this problem. If we start seeing this we can look into having a prepopulated "members" list.

Further, it is good to note that there is a timer job to keep the site collection up to date with the profile.

I know this wasn't the best written post, and it may not have wide generic applications. You have to keep in mind that we do not use Exchange, so our Active Directory is pretty weak - it does not even have email addresses, so we have to go get those from another database. Another interesting topic to consider would be alternatives to profile creation in advance, and a more standard view of how all this should work (with Exchange and a better AD).

Sunday, October 21, 2007

How to View a Museum

When I was 35 (in 1992) I went to New York City for the first time. I had gone to run the marathon, but I had the whole day before (October 31st, by the way, Halloween) to walk around and check things out, arriving on an early morning Continental 737 with only three other passengers.

I had made almost no plans other than the race. After I registered for the race (they had an amazingly UN-impressive race festival, after seeing what Revco used to do in Cleveland) I got to my hotel pretty early and so I went walking. Along the way I identified everything, without planning to go sightseeing. I knew I was right around the corner from Times Square. Carnegie Hall was obviously Carnegie Hall. Central Park was right where it should be. I realized that had I come to NY when I was younger I never would have left. Somehow it was already part of my psyche, and this has been the case all of the many times I've been back.

To make a short story extremely long (I NEVER do that!), I ended up at the Metropolitan Museum of Art. They had a huge Magritte exhibit. After walking in and seeing the scale of the museum, I decided to take in the Magritte and not much else - that would be enough. The artwork was in concert with me. I was alone, and that was a rarity at that point in my life, with two young children at the time. Wonderful art museum experiences were not new to me, but this one made me realize that this was the perfect way to go to a museum - ALONE.

My patronage of the Rock and Roll Hall of Fame and Museum goes back to the ground breaking ceremony, where I got to shake hands with Chuck Berry. I have been a Clevelander my entire life. As an adolescent I hung around downtown and took art classes and stuff and I have worked downtown almost continuously since 1984. So I have a lot of history with the Rock Hall, and I go there pretty frequently (probably on average 15-20 times a year - my office and desk look out on the building).
October 19, 2007

Most of my visits are shorter than I would like, but short visits to any museum beat long ones. I have the benefit of frequency of visits. I usually have a target exhibit - either one of the featured, temporary or borrowed ones. Or a targeted film to watch. Two of my favorite places to sit are the induction ceremony videos (they should sell these - I would buy them) and the hall of fame videos. The latter has a very good sound system and nicely edited video. Both of those also tend to be very up-lifting and the hi-fi is very pleasing to my audiophile ears.

So here are the museum conclusions:

  1. Go alone
  2. Keep visits short
  3. Know what you want from the visit - have a target
  4. Really try to absorb the exhibits - let them take over the moment

Thursday, October 11, 2007

Out of the Box SharePoint and Lotus Notes Integration

*Note: this needs a bit of revising. In further tests I found that address data can screw things up for Notes.

I should rephrase the title as NEARLY Out of the Box (OOTB). You will see in a bit. I won't claim to have all the answers here about how and why Notes and SharePoint do what they do, but I will document my findings.

First, I don't want to discuss why anyone would want to use either of these products or choose to use them together, nor will I discuss the relative merits of these products and alternatives to them.

Second, the three big areas where my firm needs integration are calendars, contacts and content. This entry concerns the first two of these. It is my belief that you can tackle these independently.

One could treat this as a glass half empty or half full kind of thing - Notes and SharePoint have a surprising amount in common. After all, Lotus and Microsoft engineers were the co-originators of the iCal format, so it shouldn't be surprising that there is support for the standard. Some of this stuff works really well. But I am being overly positive.

It seemed like presenting this information in a matrix might work, and I may still go back and change things around that way. But for now, so as to organize my observations and thoughts, I will just list each of the observations I have made:
  1. If you invite a mail-enabled SharePoint calendar to a meeting from Notes, the calendar will nicely display the meeting, accept updates, etc. The SharePoint item also has an ICS file as an attachment (handy, you'd think).
  2. SharePoint contacts lists cannot accept email.
  3. OOTB SharePoint contacts and calendars lists have Export Event buttons in their items. Clicking these presents a file download dialog for an iCalendar (ICS) or vCard (VCF) file. There is also an Export Contact selection on the edit menu for contacts. These exports work - more on that later.
  4. There is a nice, straightforward iCal Exporter piece out on CodePlex. It's one little WSP file: you add the solution and then activate it. From then on all calendars in the site collection will have an Action menu item to export the entire calendar to an ICS file.
  5. If you drag and drop an ICS file on a Notes calendar, it will open a dialog with options for "importing" one or more calendar entries.
  6. ICS (iCal) files attached to emails can be "viewed" or "opened" in Notes and will act the same as they do with drag and drop.
  7. When you choose to import an iCal, Notes creates an entry in the inbox which you then open and add to your calendar.
  8. (As far as I can tell) you cannot open an iCal file in Notes by starting with a command line or by simply opening it from the SharePoint site's export download dialog. This really bugs me.
  9. You cannot get a hold of that nice ICS file I described in #1 and do anything cool with it from a simple workflow (like mail it to people).
  10. As long as you have .VCF associated with Notes, you can open a vCard directly into Notes (from either the edit menu or the button inside the contact). You can open vCards just about any way you want in Notes - drag and drop works too.
  11. SharePoint does not accept vCards. I don't understand this. It seems like a glaring omission.
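For anyone who has not looked inside one of these ICS files, here is a minimal sketch in Python that builds a single-event iCalendar file by hand. It is not what SharePoint or the exporter produces (their files carry more properties); the function name, the `PRODID` string and the sample UID are made up for illustration, and I have not tested whether Notes imports this exact output.

```python
from datetime import datetime, timezone

ICS_TIME = "%Y%m%dT%H%M%SZ"  # UTC date-time form used by iCalendar


def escape_text(text: str) -> str:
    """Escape backslash, semicolon, comma and newline for an iCalendar TEXT value."""
    return (text.replace("\\", "\\\\")
                .replace(";", "\\;")
                .replace(",", "\\,")
                .replace("\n", "\\n"))


def minimal_ics(uid: str, summary: str, start: datetime, end: datetime) -> str:
    """Build a one-event calendar. `start` and `end` must be timezone-aware."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//ics sketch//EN",   # hypothetical product identifier
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{datetime.now(timezone.utc).strftime(ICS_TIME)}",
        f"DTSTART:{start.astimezone(timezone.utc).strftime(ICS_TIME)}",
        f"DTEND:{end.astimezone(timezone.utc).strftime(ICS_TIME)}",
        f"SUMMARY:{escape_text(summary)}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(minimal_ics(
        "demo-1@example.invalid",
        "Status meeting",
        datetime(2007, 10, 11, 14, 0, tzinfo=timezone.utc),
        datetime(2007, 10, 11, 15, 0, tzinfo=timezone.utc),
    ))
```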

My conclusion is that we are part way there without doing much. Opening iCals in Notes is funky in that it doesn't work from the file system or from the web and files presented inside the Notes client require specific user interaction and result in new calendar items in the inbox. But it works. Inviting a calendar from Notes works very nicely (getting updates and stuff). Contacts work fairly well, but only one at a time. From here comes the hard part - closing the loop and making all this usable.

Tests were done with Notes 6.5.3 clients and MOSS 2007.
A couple references:

This entry explains a lot, but not why you can't just open the file with Notes.

And this is a script for Notes to export. This would be easy to try but hard for us to implement (we are always very deliberate with mail template changes). It seemed interesting.

Thursday, October 04, 2007

Search and Research

The major references from Microsoft that I have found cover:

  • How to do basic setup (file types, how to set which content, etc.)
  • Capacity (but in the context of THEIR scale, which is hardly anyone else's at this point - terabyte index for crying out loud)
  • Some details - a little minutia, but not necessarily enough to give you all the answers (this is where I am living right now)

I have specific concerns in a couple categories - capacity planning (space, processor, etc.) and functional specifics (what is really indexed). You would think that the limits you set could have a profound influence on the former. But maybe not.

Our users are concerned about having all content completely full text indexed (it seems redundant - "complete" and "full" - but it's important!). Early on I had lots of questions about metadata vs. attachments, lists vs. libraries, etc. but for now I am strictly concerned with the file content in document libraries. More specifically, how do I balance the maximum upload and search settings, and what are the repercussions of incomplete crawls due to size ("The file reached the maximum download limit. Check that the full text of the document can be meaningfully crawled") or compressed content (index grow factor - more on this in a bit)?

There are two relevant settings, MaxDownloadSize and MaxGrowFactor. It was proposed that we set the MaxDownloadSize and the maximum upload to the same size - 50MB (we had been using a larger maximum upload). (MaxDownloadSize and MaxGrowFactor are registry settings on the indexer and the upload size is a SharePoint central admin setting.) The grow factor is a multiple of the original file's size - for a factor of 4, if the original file was 1MB and was compressed, when the indexer's filter uncompresses and starts adding text to the index, it cannot add more than 4MB.
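The arithmetic is simple enough to write down. This is a minimal sketch in Python of the two limits as I understand them, not the indexer's actual logic; the function names are my own, and I am assuming the download limit is counted in megabytes of 1024 x 1024 bytes.

```python
MB = 1024 * 1024  # assumption: the settings count binary megabytes


def max_indexable_bytes(file_size_bytes: int, max_grow_factor: int) -> int:
    """Upper bound on text the filter may add for a file of this size."""
    return file_size_bytes * max_grow_factor


def fits_download_limit(file_size_bytes: int, max_download_size_mb: int) -> bool:
    """True if the whole file is within the crawler's download limit."""
    return file_size_bytes <= max_download_size_mb * MB


if __name__ == "__main__":
    # The 1MB file with a grow factor of 4 from the paragraph above.
    print(max_indexable_bytes(1 * MB, 4) // MB, "MB of text at most")
    # A 47MB file against limits of 10 and 50.
    print(fits_download_limit(47 * MB, 10), fits_download_limit(47 * MB, 50))
```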

My tests had two goals - (1) determine the functional ramifications of the settings and (2) find out how storage was impacted. For now I am not interested in how long it takes to complete, network or SQL impact, etc. None of those things are much of an issue for me for the time being.

For my initial tests, I had the default MaxDownloadSize and MaxGrowFactor of 10MB and 4 respectively (the grow factor is a multiplier, not a size). I have a test database that has 50,000 documents in it and a significant amount of content larger than 10 MB. I also created some control content that consisted of large PDF's, some with a text source and some OCRed, and large .DOC and .DOCX files (mostly text, some graphics).

The results were ugly. With PDF's over the 10 MB MaxDownloadSize, no content was indexed (apparently the searchable parts are not in the first 10MB). My Word tests were inconsistent and not extensive enough to be conclusive. It seems that sometimes the whole doc could be ignored and sometimes not. Word 2007 documents seem to work well, even when they are too big.

My plan was then to raise MaxDownloadSize to 50 MB (and not mess with MaxGrowFactor). But at first I followed the advice of the TechNet entries below and it did nothing. After several passes at this, I started to look around at services and realized that there must be another registry key. But once I found where I thought the setting should go, it wasn't there. I now had something different to search for and was able to confirm through an entry written by Bill English that you need to add MaxDownloadSize under the key for the MOSS search (not the WSS one, unless you are not running MOSS):

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager

Then I added MaxDownloadSize there as a DWORD value and set it to 50 (decimal).
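If you would rather not click through regedit, the same change can be captured as a .reg file. Here is a minimal sketch in Python that only generates the file text for the key and value described above; the function name is hypothetical, and you would still need to review the output and import it on the indexer yourself.

```python
# Key path as given above; the value name MaxDownloadSize is the setting discussed.
GATHERING_MANAGER_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server"
    r"\12.0\Search\Global\Gathering Manager"
)


def reg_file_text(value_name: str, value: int,
                  key_path: str = GATHERING_MANAGER_KEY) -> str:
    """Return .reg file text that sets one DWORD value under key_path."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 unsigned bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",
        # .reg files store DWORDs as eight hex digits, so decimal 50 is 00000032.
        f'"{value_name}"=dword:{value:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(reg_file_text("MaxDownloadSize", 50))
```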

Now I could re-run the tests and expect some results, and that's what I got. There were no longer any warnings:

"The file reached the maximum download limit. Check that the full text of the document can be meaningfully crawled."

None. This is interesting, because some of my test PDF's didn't show up in the results (I had one that was 1700+ pages, mostly text - 47MB). But most of them did work. It was an arduous test, and what I learned so far is that the settings make it better, but it is still not perfect. We previously had done a lot of this kind of work with Lotus Notes databases. Notes search is very different (and probably superior in most regards, like most things Notes). But Notes also had known issues with some PDF's not searching correctly, and with Notes it is not so easy to install a different IFilter, at least as far as I know.

Incredibly, the database and local index files were actually smaller. Let me explain. The search database was about 30% larger, but there was a significant amount of free space. After a shrink, it was smaller. The local index files on the server were a bit smaller. I felt that I had good controls on the experiment but I don't understand that part of the results. I think we will just be very careful and monitor the growth of the index database.


Tuesday, October 02, 2007

Moving Sites Around - Bad File/Folder Names

As we are a law firm, we frequently have verbose file names and folder names. We have done some training and try to have everyone keep it to a paragraph or two :-) and limit the punctuation. We also have a little file renaming tool for mass importers to use.

An interesting situation came up when moving a site that had a correspondence folder receiving lots of emails. As the email subject line may include all kinds of illegal characters and be way too long, when SharePoint creates a folder in which to stash the attachments, it seems to do this: display the incorrect name and truncate the actual pointers. Although I really need to research this more, I am in fix mode, so it is just an observation.

These correspondence folders with crazy long names seem to work OK in normal operations. It only becomes a problem when trying to do an export/import (move the site somewhere). The export writes without errors, but the import blows up badly. You end up having to try to rename the folders. This doesn't work well using the UI, but it works OK in Windows Explorer.
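For what it's worth, the kind of cleanup involved can be sketched in a few lines. This is a minimal Python sketch and not our actual renaming tool: the function name is made up, the character list is the set I understand SharePoint 2007 rejects in file and folder names, and the 128 character cap is an assumption you should check against your own environment.

```python
import re

# Assumption: characters SharePoint 2007 does not allow in file/folder names.
ILLEGAL_CHARS = '~"#%&*:<>?/\\{|}'
MAX_NAME_LENGTH = 128  # assumption: adjust to whatever limit applies to you


def clean_folder_name(name: str, max_length: int = MAX_NAME_LENGTH,
                      replacement: str = "_") -> str:
    """Return a folder name with illegal characters replaced and length capped."""
    cleaned = "".join(replacement if ch in ILLEGAL_CHARS else ch for ch in name)
    cleaned = re.sub(r"\.{2,}", ".", cleaned)   # no consecutive periods
    cleaned = cleaned.strip(" .")               # no leading/trailing space or period
    if len(cleaned) > max_length:
        cleaned = cleaned[:max_length].rstrip(" .")
    return cleaned or replacement               # never return an empty name


if __name__ == "__main__":
    print(clean_folder_name("RE: Smith v. Jones?"))
```

Running the renamed list past a human before applying it is probably wise, since two long subjects can collapse to the same name once they are truncated.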