Wednesday, December 28, 2016

Excel Add-in Development | Portability

Designing and developing portable Excel add-ins across different versions of Microsoft Office (2000-2016, x86-x64) is an important goal for keeping maintenance costs low whilst enabling you to grow and support as broad a Office customer base as possible.  This article describes how to do just that by combining two open source .NET libraries: Excel DNA and Net Office.

 

Excel DNA is a .NET library whose API’s simplify writing COM-free Add-ins with User Defined Functions, custom Ribbons, RTD components, and VBA script friendly API’s for Office task automation.  The project started by Govert van Drimmelen in 2005 has a broad user community across the financial and scientific domains.

 

Net Office is a .NET library whose API’s allow one to target multiple versions of Office 2000-2016 without referencing the platform specific PIA and VSTO assemblies as well as managing the COM proxy objects.  The project started by Sebastian Lange in 2011 also has a large user community across the Office platform.  The Net Office API allows one to interact with the Excel object model (e.g. workbooks, worksheets, ranges) in a Office platform independent manner.

 

Architecture

 

 

Implementation

1.     Create a .NET class library in your preferred language (e.g. C#, F#, VB.NET).

2.     Add references to NuGet packages of Excel DNA and Net Office.

a.  ExcelDna.Integration

b.  ExcelDna.Addin

c.  NetOffice.Excel

3.     Define a class that implements ExcelDna.Integration.IExcelAddIn interface as well as the System.IDisposable interface.

a.     Implement the Singleton pattern.

 

            public class MyAddIn : IExcelAddIn, IDisposable

       {

        // singleton variable

        private static MyAddIn m_Instance = null;

        // instance variables

        private object m_ComApplication;

        private Application m_ExcelApplication;

        private bool m_Disposed = false;

 

        static MyAddIn()

        {

            m_Instance = new MyAddIn();

        }

 

        public MyAddIn() {}

 

        public static MyAddIn Instance { get { return m_Instance; } }

}

b.     Implement the AutoOpen and AutoClose methods.  These methods are invoked when your Addin is first installed and then un-installed.

c.      Implement new OnActivate and OnDeactivate methods.  In the former, get a reference to the root Application object from Excel DNA and connect to a Net Office Application wrapper, and voila, you now have programmatic access to the Excel object and event model through a platform independent API.

 

        internal void OnActivate()

        {

            if (m_ComApplication == null)

            {

                m_ComApplication = ExcelDnaUtil.Application;

            }

 

            if (m_ExcelApplication == null)

            {

// glue NetOffice + Excel DNA!

                m_ExcelApplication = new Application(null, m_ComApplication);

            }

        }

 

        internal void OnDeactivate()

        {

            Dispose();

            GC.Collect();

            GC.WaitForPendingFinalizers();

            GC.Collect();

            GC.WaitForPendingFinalizers();

        }

 

d.     Implement the Dispose method and release references to the root Net Office and Excel DNA application objects.  This will prevent memory leaks and eliminate the “phantom” Excel process after your program exits.

 

        public void Dispose()

        {

            Dispose(true);

        }

 

        public void Dispose(bool disposing)

        {

            try

            {

                if (m_Disposed)

                {

                    if (disposing)

                    {

                        if (m_ExcelApplication != null)

                        {

                            m_ExcelApplication.Quit();

                            m_ExcelApplication.Dispose();

                        }

                    }

 

                    if (m_ComApplication != null)

                    {

                        Marshal.ReleaseComObject(m_ComApplication);

                    }

                }

            }

            catch { }

            finally

            {

                m_Disposed = true;

                m_ExcelApplication = null;

                m_ComApplication = null;

            }

        }

 

4.     Define a class that either extends ExcelDna.Integration.CustomUI.ExcelRibbon or ExcelDna.Integration.ComAddin.

a.     Mark the class with several System.Runtime.InteroperServices attributes.

                                               i.          Mark ComVisible as true.

                                              ii.          Set the ProgId as some fixed string value.

b.     Override the OnConnection and OnDisconnection methods and invoke the Addin’s OnActivate and OnDeactivate methods through the singleton.

 

    [ComVisible(true)]

    [ClassInterface(ClassInterfaceType.AutoDispatch)]

    [ProgId("Athena.Excel.Sample.MyAddIn")]

    public class MyRibbon : ExcelRibbon

    {

        private MyVBAFacade vbaProxy = new MyVBAFacade();

 

      

        public override void OnConnection(object comApp, ext_ConnectMode ConnectMode, object objAddin, ref Array custom)

        {

            base.OnConnection(comApp, ConnectMode, objAddin, ref custom);

 

            if (objAddin != null)

            {

                dynamic excelAddin = objAddin;

                excelAddin.Object = vbaProxy;

            }

 

            MyAddin.Instance.OnActivate();

        }

 

        public override void OnDisconnection(ext_DisconnectMode RemoveMode, ref Array custom)

        {

            base.OnDisconnection(RemoveMode, ref custom);

 

            MyAddin.Instance.OnDeactivate();

        }

 

        public void OnHello(IRibbonControl control)

        {

            MessageBox.Show("Hello, world!");

        }

    }

}

 

5.     Use the Addin.Instance.Application reference to interact with Excel API through Net Office in a platform independent manner that works across Microsoft Office 2000-2016.  Example scenarios include the following:

a.     Register for Excel events such as file open/close and macro re-calculations.

b.     Manipulate Excel cell ranges in response to user events and custom functions.

c.      Perform context specific activity based on the active Excel workbook or cell.

d.     Prompt the user for input from the Excel range reference dialog box.

e.     Set the Excel status bar.

Thursday, December 15, 2016

Principles and Practices | SDLC

This article summarizes some of the core principles, best practices, and my common sense for governing the Software Development Life Cycle.  These SDLC ideas are applicable for both products and projects.  I look forward to discussing these ideas with the wider community.

 

Category

Principles and Practices

Feasibility

Identify the product/project investors, business sponsor, and key stakeholders.

Establish economic model with costs and benefits spanning average and best/worst cases.

Write a clear mission and vision statement.

Requirements

Document requirements for Minimum Viable Product (MVP).

Document user stories, data model, 3rd party constraints, as well as the why, not just the what.

Prototype UI and MVP concepts; get feedback from key users and domain experts.

Project Management

Define delivery of the project or product into large milestones and smaller iterations.

Use 3-4 weeks per iteration and sprint. Plan sprints as a team.

Schedule daily scrums and weekly meeting to review progress, obstacles, and scope.

Define change control process that connects stakeholders with cost/benefit tradeoffs of change.

Create work breakdown structure; define and sequence activities.

Get labor and material cost estimates from engineering - not management or marketing.

Establish a single point of accountability for technology, process, and business matters.

Plan and distribute communications.  Manage stakeholder expectations.  Report performance.

Identify top 10 project/product risks.  Monitor and communicate risk assessment throughout.

Identify core team of 3-5 people (e.g. size of large pizza) including business, engineering, and QA.

Publish planning documents for managing costs, risks, QA, team, procurement, and requirements.

Architecture

Define major building blocks including their responsibilities and interfaces (Miller’s Law: < 9).

Visualize the solution architecture for shared understanding using UML diagrams.

Design

Single Responsibility - components should have only 1 reason to change and exist.

Encapsulation - share only what is needed; hide the details.

Cohesion - gather and release related components, functions, and data together.

Dependency Inversion - depend on abstractions, not concretions to manage change.

Simplicity - make components/systems as simple as possible; minimize complexity.

Stability - use stable abstractions, dependencies, and interfaces.

Consistency - use consistent naming conventions, code styles, etc.

DRY - Do not Repeat Yourself.

YAGNI - You Ain’t Gonna Need It.  Things Change. Plans change. Avoid gold plating.

KISS - Keep It as Simple as poSsible.  But no simpler.

Refactor during maintenance iterations.  Design does not end in one phase.

Consider domain specific languages to represent flexible logic and rules.

Consider reflection, attribute-based programming for dynamic feature surfaces.

Consider microservices to reduce monolithic system into modular components.

Consider concurrency and scaling to multiple processors and machines.

Consider portability and migration to other platform/infrastructure environments.

Consider localization and internationalization.

Security

How is the solution secured?  Consider users, data, and system itself.

How are users authenticated and authorized?

Does data need to be encrypted in storage and transmission?

Grant minimum privileges to users and service accounts.

Use enterprise identity management systems (e.g. Active Directory, ) that federate.

Avoid inventing your own user/group/password management and encryption systems.

Database

Is ACID necessary?  Then perhaps SQL/RDBMS.  Or does scalability suggest NoSQL?

Visualize logical entities and their relationships using diagrams.

Use primary keys and foreign keys for referential integrity.

Use default and check constraints for column data integrity.

Use indexes for large tables (1M+) and partitions for larger data sets (1B+). 

Use NoSQL, document oriented databases for very large data sets (1T+).

Avoid cursors, triggers, and dynamic SQL.  Use them sparingly and consciously.

Secure access through groups and roles.

Plan for growth and disaster recovery.  Stress test for performance and reliability.

Artifact Control

Manage source code, binary references, and documentation using Version Control System.

Prefer GIT; TFS and Subversion are OK though.

Secure access to artifacts through groups and roles.

Plan for several people working on system over time, perhaps at same time.

Source code is not owned by one person; it is collectively owned by the team.

Quality Control

Code defensively. Check parameters. Use assertions. No broken windows.

Are there tests for Correctness? Integration? Performance?  Security?  UX?

Define automated unit tests such as MSTest, NUnit, jUnit, or RUnit.

Assign manual beta testing driven by checklists to independent group of power users.

Review designs before coding. 

Review code before production deployment.

Devops

Where will the solution run? desktops?  Mobile?  On Prem?  Cloud?

How will you deliver system updates and changes over time?

Maintain separate environments for Development, Test, and Production.

How do you log system activity especially errors and end-user usage?

Automate builds and deployments using VSTS, Chef, Octopus, PowerShell, bash, etc.

Support

How will users contact you regarding features and defects?

Track past, present, and future work in an online backlog

Use JIRA, VSTS, or a similar system.   Avoid spreadsheets and email.

Customers

Delight users. Under promise and over deliver.

Prioritize system feedback from power users who influence others and are passionate.

Key users should be part of project and product teams and available - onsite, daily.

Peopleware

Align team member interests with work assignments.

Create workspaces that support thoughtful, focused, un-interrupted work.  Avoid open offices.

Establish accountability and responsibility for decisions by making people put skin in the game.

Take time to celebrate achieving milestones and performing post-mortems.

Metaphors

Cone of uncertainty - explore risk areas and learn to reduce uncertainty over lifecycle.

Triangle of constraints - cost, scope, and quality. Pick 2 out of 3; it’s tough to have it all.

Tracer bullets - use prototypes and feature spikes to find the moving target in the dark.

Microstones - track progress on smaller units work to keep team focused within milestones.

Proverbs

Work with users to think like a user.

Don’t gather requirements like scattered seeds; dig for them like buried treasure.

You will fail, if there is no plan, no one following plan, and no change to plans when things change.

Abstractions live longer than details.

Test early, often, and automatically.

Provide options; don’t give lame excuses.

The word of someone else’s mouth sells better than opening yours.

Good architectures with conceptual integrity should manage complexity and change.

Do not assume something to be true; try to prove it.

Sign your work; take pride in quality and craftsmanship.

If you found it painful more than twice, then automate it.

The cost of “quick and dirty” often lasts longer and gets dirtier than the quick win’s benefit.

The best and the perfect are the enemy of the good.

You don’t make money by developing software, but by delivering and selling software.

Leverage 80/20 rule - finish 80% of requirements before design and 80% of design before coding.

 

References

       Feasibility/Requirements

       Lean Enterprise by Humble, Molesky, and O’Reilly

       Design/Construction

       Extreme Programming by Kent Beck

       Pragmatic Programmer by Andrew Hunt and Dave Thomas

       Code Complete by Steve McConnell

       Design Patterns by GoF

       DevOps

       Release It by Michael Nygard

       Project Management

       Essential Scrum by Ken Rubin

       PMBOK by Project Management Institute

       SDLC

       Software Project Survival Guide by Steve McConnell

       Peopleware by Tom DeMarco and Tim Lister

       12 Steps to Great Code by Joel Spolsky

       Principles and Practices of SDLC

Sunday, December 4, 2016

Excel Add-in Development | Framework Criteria

Although the desktop may be slowly dying as phones, tablets, and other devices encroach; many businesses and consumers remain loyal fans and active users of Microsoft Office and Excel. According to Microsoft, there were 1.2 Billion users of Office as of March 2016.  Excel’s core capabilities include a flexible tabular data representation indexed by rows and columns, a charting module, a catalog of built-in functions, and a rich programming model using VBA.  For heavy, data intensive commercial and academic users of Excel, there is a need to extend Excel’s core capabilities with custom calculations, new data sources, real-time refresh, and embedded GUI’s.

In this series, I will be covering several topics.

  • Excel Addin Framework Evaluation Criteria
  • UDF
  • Embedded GUI
  • RTD
  • Portability
  • Deployment
  • Troubleshooting

There are several criteria when evaluating which Excel addin framework to use.

 

Criteria

VSTO

Add-in Express

Excel DNA

PyXLL

Platform

Microsoft COM, VSTO, C/C++, .NET, VBA

.NET

.NET, VBA

Python v2.3-3.6

Price

MSDN

Commercial.

$500-900/dev/yr

Free

Commercial.

$100-200/dev/yr

Portability

No. PIA version dependency.

Has portability layer for O2000-2016.

Use with Net Office to support O2000-2016.

Supports O2000-2016.

Open Source

No

No

Yes

Yes

Support

Yes

Yes.

Yes

Yes

UDF

Yes. Native

Yes

Yes

Yes

Ribbon

Yes. Native

Yes

Yes

Yes

RTD

Yes. Native

Yes

Yes

Yes

GUI

Yes. Native

Yes

Yes

No

Office Breadth

Yes. Native

Supports Excel, Word, and Outlook

Only Excel

Only Excel

History

1997

1998

2005

2010

If you are looking to build an Excel product suite or family of add-ins, then I strongly encourage you to examine Add-in Express or Excel DNA.  Both are fully featured and have broad language support (unlike PyXLL), and both are much easier to use than the native Microsoft VSTO libraries.  Both have active user communities with 1000’s of developers each.  

If you intend to support other Office products such as Word or Outlook and are also not comfortable with using open source software, then choosing Add-in Express makes wise sense from a long-term maintenance, TCO perspective.  

If you are budget conscious and are developing a PoC or MVP for just Excel, then choosing Excel DNA and Net Office is a solid starting point that you can build upon.

If you intend to develop Excel add-ins which use native, platform features that are idiosyncratic to a specific version of Office (which I do NOT recommend unless it offers durable competitive advantage to your customers), then use VSTO.

Saturday, December 3, 2016

Book Summary | Scalability Rules

This article is a reboot of my blog.  Posts will consist of book reviews, informal musings on economics, history, and software, as well as more formal, didactic articles on technology.

Here is a book summary and review of Scalability Rules by Abbott and Fisher.  Practical.  Concise.  Packed with good advice on design, infrastructure, and organizational processes.  The latest edition includes stories that give context to the rules.  Highly recommended.


Category
Rule
Design
Do not over engineer the solution.
Design scale into solutions: design @ 20X, implement @ 3X, deploy @ 2X
Avoid single point of failure.
Avoid sequencing components and systems in synchronous series.
Strive for statelessness.  Watch out for server affinity.
Communicate asynchronously as much as possible (prefer pub-sub to RPC).
Design your application to be monitored.
Reduce
Reduce object usage including DB, DNS, sockets, images, CPU, MEM, DISK, etc
Cache
Cache static, infrequently changing information and objects appropriately.
Use web page cache, image cache, application reference data cache, etc.
Scale
Design to clone things on commodity hardware and network load balancer.
Design to split different and similar things.
Design your solution to scale out.  Use AKF scale cube.
Scale out your data centers.
Design to leverage the cloud, but be wary of scaling through 3rd party.  Details matter.
Platform
Use database appropriately.  Need ACID (Atomic, Consistent, Isolated, Durable)?
Use firewalls appropriately.
Use log files actively.
Do not double check your work… do not re-read your data after write.
Stop redirecting traffic.
Leverage Content Distribution Networks (CDN).
Use expire headers in HTTP requests to reduce duplicates for static data.
Purge, archive, and cost justify storage.
Process
Test.  Measure.  Rinse.  Repeat.
Learn aggressively and especially from failure through team discussion.
Do not rely on QA to find mistakes.
Design for rollback of code.
Be aware of costly relationships and dependencies (networks, databases, etc).
Database
Understand object relationships.
Use the correct database lock.
Avoid multi-phase commit.
Avoid select for update.
Avoid select *.
Separate business intelligence from transaction processing.
General
Do not do anything more than once.
Do not do anything that is unnecessary.
Get as much as in one request as possible.