I don't know about “fast” delegates, but we use delegates in our engine as well, and since everything needs to be C99 standard compliant, our delegates work with every C++ standard and have “only” one indirection, which makes them slower than ordinary function calls. From my perspective, this is the cleanest and fastest way of doing it.
We differentiate between two delegate implementations: a static one for everything that isn't a class member function pointer, and a dynamic one that can be used for both class member functions and static functions.
The static one doesn't even have an indirection, as it doesn't need to cover dynamic calls requiring a this-pointer, so I'll show the dynamic one here:
/**
A dynamic typed struct providing anonymous calling context
*/
template<typename ret do_if(ORDER, _separator) variadic_decl(typename Args, ORDER)> struct DynamicDelegate<ret(variadic_decl(Args, ORDER))>
{
public:
typedef ret ReturnValue;
typedef ret (*FunctionPointer) (variadic_decl(Args, ORDER));
typedef ret (*ContextPointer) (void* do_if(ORDER, _separator) variadic_decl(Args, ORDER));
/**
A pointer to the internal call context
*/
force_inline ContextPointer Context() { return context; }
/**
An instance pointer when initialized to a member function, null_ptr otherwise
*/
force_inline void* Target() { return target; }
/**
Copy constructor
*/
force_inline DynamicDelegate(DynamicDelegate const& delegate) : context(delegate.context), target(delegate.target)
{ }
/**
Default constructor
*/
force_inline DynamicDelegate() : context(se_null), target(se_null)
{ }
/**
Class constructor initializes this context with given values
*/
force_inline DynamicDelegate(ContextPointer context, void* target) : context(context), target(target)
{ }
/**
Class destructor
*/
force_inline ~DynamicDelegate()
{ }
force_inline DynamicDelegate<ret(variadic_decl(Args, ORDER))>& operator=(DynamicDelegate<ret(variadic_decl(Args, ORDER))> const& delegate)
{
Bind(delegate.context, delegate.target);
return *this;
}
force_inline operator bool() const { return context != se_null; }
force_inline bool operator!() const { return !(operator bool()); }
force_inline ret operator()(variadic_args(Args, a, ORDER)) const
{
return Invoke(variadic_decl(a, ORDER));
}
/**
Binds the delegate to a new target
*/
force_inline void Bind(ContextPointer ctx, void* instance)
{
context = ctx;
target = instance;
}
/**
Binds the delegate to a new target
*/
force_inline void Bind(ContextPointer ctx)
{
Bind(ctx, se_null);
}
/**
Calls the function this context is bound to
*/
force_inline ret Invoke(variadic_args(Args, a, ORDER)) const
{
return context(target do_if(ORDER, _separator) variadic_decl(a, ORDER));
}
/**
Returns the parameter count of this delegate signature
*/
force_inline static int ParameterCount() { return ORDER; }
private:
ContextPointer context;
void* target;
};
We expect two function pointers: one points to “the real” function we want to invoke, and the other is the proxy call. The proxy is invoked together with a third pointer, the object instance required to call the function if it is a class member function. For a static function, this value is null.
The variadic stuff is a set of macros which generate template and function arguments depending on the ORDER value specified. We generate overloads of this template for 0 up to 10 arguments in the target function pointer.
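To make the macro expansion easier to picture, here is a minimal hand-written sketch of what the one-argument overload (ORDER == 1) might look like once the macros are resolved. The macros themselves are not shown in this post, so the expansion, the name Args0 and the AddToCounter proxy are assumptions for illustration only, and force_inline and se_null are replaced by plain C++.

```cpp
#include <cassert>
#include <cstddef>

// Primary template is only declared; every arity gets its own specialization.
template<typename Signature> struct DynamicDelegate;

// Assumed hand expansion for ORDER == 1 (not the engine's real generated code).
template<typename ret, typename Args0>
struct DynamicDelegate<ret(Args0)>
{
    typedef ret (*ContextPointer)(void*, Args0);

    DynamicDelegate() : context(NULL), target(NULL) { }
    DynamicDelegate(ContextPointer ctx, void* instance) : context(ctx), target(instance) { }

    void Bind(ContextPointer ctx, void* instance) { context = ctx; target = instance; }
    operator bool() const { return context != NULL; }

    // The single indirection: forward to the proxy together with the stored instance.
    ret operator()(Args0 a0) const { return context(target, a0); }

    static int ParameterCount() { return 1; }

private:
    ContextPointer context;
    void* target;
};

// Hypothetical proxy for this demo: treats the target as an int counter.
int AddToCounter(void* target, int amount)
{
    int* counter = static_cast<int*>(target);
    *counter += amount;
    return *counter;
}

// Hypothetical helper that exercises the expanded delegate.
int ExpandedDelegateDemo()
{
    int counter = 40;
    DynamicDelegate<int(int)> add(&AddToCounter, &counter);
    return add(2);
}
```

The same pattern would repeat for every argument count, which is exactly the repetition the macros are there to avoid.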
The real “magic” happens in the call proxies. We have different proxy implementations which can be used:
/**
Class Member Context Utility
*/
template<typename ret do_if(ORDER, _separator) variadic_decl(typename Args, ORDER)> struct InstanceCallContext<ret (variadic_decl(Args, ORDER))>
{
public:
/**
Provides a class member function call context to use in dynamic delegate
*/
template<class T, ret (T::*type) (variadic_decl(Args, ORDER))> static force_inline ret Functor(void* target do_if(ORDER, _separator) variadic_args(Args, a, ORDER))
{
T* ptr = static_cast<T*>(target);
return (ptr->*type)(variadic_decl(a, ORDER));
}
/**
Provides an anonymous class member function call context to use for dynamic calling
*/
template<class T, ret (T::*type) (variadic_decl(Args, ORDER))> static force_inline void AnonymousFunctor(void* target, void** args)
{
T* ptr = static_cast<T*>(target);
*reinterpret_cast<typename SE::TypeTraits::Const::Remove<typename SE::TypeTraits::Reference::Remove<ret>::Result>::Result*>(args[ORDER]) = (ptr->*type)(variadic_deduce(Args, args, ORDER));
(void)args;
}
/**
Provides a const class member function call context to use in dynamic delegate
*/
template<class T, ret (T::*type) (variadic_decl(Args, ORDER)) const> static force_inline ret ConstFunctor(void* target do_if(ORDER, _separator) variadic_args(Args, a, ORDER))
{
T* ptr = static_cast<T*>(target);
return (ptr->*type)(variadic_decl(a, ORDER));
}
/**
Provides an anonymous const class member function call context to use for dynamic calling
*/
template<class T, ret (T::*type) (variadic_decl(Args, ORDER)) const> static force_inline void AnonymousConstFunctor(void* target, void** args)
{
T* ptr = static_cast<T*>(target);
*reinterpret_cast<typename SE::TypeTraits::Const::Remove<typename SE::TypeTraits::Reference::Remove<ret>::Result>::Result*>(args[ORDER]) = (ptr->*type)(variadic_deduce(Args, args, ORDER));
(void)args;
}
};
As you can see, all it does is cast the instance argument to the class type, call the member function on that instance and return the result. The static one looks similar:
/**
Static Context Utility
*/
template<typename ret do_if(ORDER, _separator) variadic_decl(typename Args, ORDER)> struct StaticCallContext<ret (variadic_decl(Args, ORDER))>
{
public:
typedef ret (*FunctionPointer) (variadic_decl(Args, ORDER));
/**
Provides a static function call context to use in dynamic delegate
*/
template<FunctionPointer type> static force_inline ret Functor(void* target do_if(ORDER, _separator) variadic_args(Args, a, ORDER))
{
(void)target;
return type(variadic_decl(a, ORDER));
}
/**
Provides an anonymous static function call context to use for dynamic calling
*/
template<FunctionPointer type> static force_inline void AnonymousFunctor(void* target, void** args)
{
*reinterpret_cast<typename SE::TypeTraits::Const::Remove<typename SE::TypeTraits::Reference::Remove<ret>::Result>::Result*>(args[ORDER]) = type(variadic_deduce(Args, args, ORDER));
(void)target;
(void)args;
}
};
We also have one of those for member access (e.g. variable/property getter/setter), but that's not relevant here.
The usage is btw as simple as
DynamicDelegate<void (const char*)> log;
Console instance;
// The member function is passed as a pointer-to-member, the instance as its address
log.Bind(&InstanceCallContext<void (const char*)>::Functor<Console, &Console::WriteLine>, &instance);
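For anyone who wants to try the idea without the macro set, the following is a rough, self-contained sketch of the same proxy technique. It swaps the ORDER macros for C++11 variadic templates, so it is not the engine's code. The Console stand-in, CountCall and the two demo functions are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <string>

template<typename Signature> struct DynamicDelegate;
template<typename Ret, typename... Args>
struct DynamicDelegate<Ret(Args...)>
{
    typedef Ret (*ContextPointer)(void*, Args...);

    DynamicDelegate() : context(nullptr), target(nullptr) { }
    void Bind(ContextPointer ctx, void* instance = nullptr) { context = ctx; target = instance; }
    explicit operator bool() const { return context != nullptr; }
    // One indirect call through the proxy, which knows the real function.
    Ret operator()(Args... args) const { return context(target, args...); }

private:
    ContextPointer context;
    void* target;
};

template<typename Signature> struct InstanceCallContext;
template<typename Ret, typename... Args>
struct InstanceCallContext<Ret(Args...)>
{
    // The member function is a template argument, so the proxy is a plain function.
    template<class T, Ret (T::*Method)(Args...)>
    static Ret Functor(void* target, Args... args)
    {
        return (static_cast<T*>(target)->*Method)(args...);
    }
};

template<typename Signature> struct StaticCallContext;
template<typename Ret, typename... Args>
struct StaticCallContext<Ret(Args...)>
{
    template<Ret (*Function)(Args...)>
    static Ret Functor(void*, Args... args) { return Function(args...); }
};

// Hypothetical stand-in for the Console class used above.
struct Console
{
    std::string last;
    void WriteLine(const char* text) { last = text; }
};

int g_staticCalls = 0;
void CountCall(const char*) { ++g_staticCalls; }

std::string MemberCallDemo()
{
    Console instance;
    DynamicDelegate<void(const char*)> log;
    log.Bind(&InstanceCallContext<void(const char*)>::Functor<Console, &Console::WriteLine>, &instance);
    log("hello");
    return instance.last;
}

int StaticCallDemo()
{
    g_staticCalls = 0;
    DynamicDelegate<void(const char*)> log;
    log.Bind(&StaticCallContext<void(const char*)>::Functor<&CountCall>);
    log("one");
    log("two");
    return g_staticCalls;
}
```

Because the target function is baked into the proxy at compile time, the delegate itself stays two pointers wide and the compiler is free to inline the real call into the proxy.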
Our current event system is a template class as well. We don't use a single manager for it but have typed events depending on the event's data. For example, we have an input event which is invoked from the input system whenever data arrives:
Events<InputData>::Add(myInputCallback);
Events<InputData>::Invoke(...);
Events<InputData>::Remove(myInputCallback);
Operator overloads also exist to get some of the luxury of the C# event syntax: += callback to add and -= callback to remove.
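As a rough illustration of the C#-style syntax, here is a small sketch of how such operators could be written. It assumes an instance-based Event class with plain function pointers as callbacks, which is a simplification: our real system uses static Events<T> functions and delegates. Event, InputData's key field, OnInput and the demo function are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

template<typename Data>
class Event
{
public:
    typedef void (*Callback)(const Data&);

    // C#-like subscribe
    Event& operator+=(Callback cb)
    {
        subscribers.push_back(cb);
        return *this;
    }

    // C#-like unsubscribe: drops every entry equal to the callback
    Event& operator-=(Callback cb)
    {
        subscribers.erase(std::remove(subscribers.begin(), subscribers.end(), cb), subscribers.end());
        return *this;
    }

    void Invoke(const Data& data) const
    {
        for (Callback cb : subscribers) cb(data);
    }

    std::size_t Count() const { return subscribers.size(); }

private:
    std::vector<Callback> subscribers;
};

// Hypothetical event payload and handler for the demo.
struct InputData { int key; };

int g_keySum = 0;
void OnInput(const InputData& data) { g_keySum += data.key; }

int EventOperatorDemo()
{
    g_keySum = 0;
    Event<InputData> input;
    input += &OnInput;
    InputData first = { 5 };
    input.Invoke(first);    // delivered to OnInput
    input -= &OnInput;
    InputData second = { 7 };
    input.Invoke(second);   // nobody is subscribed anymore
    return g_keySum;
}
```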
The event system itself is just collecting event data and does nothing except bookkeeping. Events are fired from our thread pool when we register the Events<Args>::Process function. This is running asynchronously on whatever thread is currently available.
The inner function performs some important operations before callbacks are iterated. First, a spin lock is acquired which protects the collection of subscribers, then the subscribers are cloned to another list that is also part of the event manager's internal data. We're doing this because subscribers can unsubscribe from the event while it is being processed, which would cause a data mismatch in the subscriber list. Then the list of collected event data is iterated and every callback is called for each item stored. This is spin locked as well, but only while accessing the next item; this way other threads can interrupt the dispatch to add events, and those events are processed in the same run as well. This isn't the case for subscribers, though.
This is a quite naive but functional approach, but as we're using the reactive programming pattern really extensively in our C# code, I'm thinking about adding it to the engine and replacing the old event system as well. Reactive programming has some advantages, since you're working with streams and can chain event streams to each other. However, I have to think about this some more, since our current event system only fires whenever a new frame is created and reactive programming is an on-demand approach.
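The processing steps described above can be sketched roughly as follows. This is a simplified model under several assumptions, not our actual implementation: the spin lock is built on std::atomic_flag, callbacks are plain function pointers that also receive the queue, and EventQueue, OnceHandler and SnapshotDemo are hypothetical names.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Minimal spin lock usable with std::lock_guard (assumed stand-in).
class SpinLock
{
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock() { while (flag.test_and_set(std::memory_order_acquire)) { } }
    void unlock() { flag.clear(std::memory_order_release); }
};

template<typename Data>
class EventQueue
{
public:
    typedef void (*Callback)(EventQueue&, const Data&);

    void Add(Callback cb)
    {
        std::lock_guard<SpinLock> guard(subscriberLock);
        subscribers.push_back(cb);
    }
    void Remove(Callback cb)
    {
        std::lock_guard<SpinLock> guard(subscriberLock);
        subscribers.erase(std::remove(subscribers.begin(), subscribers.end(), cb), subscribers.end());
    }
    void Push(const Data& data)
    {
        std::lock_guard<SpinLock> guard(dataLock);
        pending.push_back(data);
    }
    void Process()
    {
        // Step 1: clone the subscribers so Remove() during dispatch is harmless.
        std::vector<Callback> snapshot;
        {
            std::lock_guard<SpinLock> guard(subscriberLock);
            snapshot = subscribers;
        }
        // Step 2: lock only while fetching the next item, so events pushed
        // during dispatch are still handled in this run.
        for (std::size_t index = 0; ; ++index)
        {
            Data item;
            {
                std::lock_guard<SpinLock> guard(dataLock);
                if (index >= pending.size()) { pending.clear(); break; }
                item = pending[index];
            }
            for (Callback cb : snapshot) cb(*this, item);
        }
    }

private:
    SpinLock subscriberLock;
    SpinLock dataLock;
    std::vector<Callback> subscribers;
    std::vector<Data> pending;
};

int g_received = 0;

// Hypothetical handler: unsubscribes itself and pushes a follow-up event.
void OnceHandler(EventQueue<int>& queue, const int& value)
{
    g_received += value;
    if (value == 1)
    {
        queue.Remove(&OnceHandler);
        queue.Push(10);
    }
}

int SnapshotDemo()
{
    g_received = 0;
    EventQueue<int> queue;
    queue.Add(&OnceHandler);
    queue.Push(1);
    queue.Process();   // handles 1 and the follow-up 10 via the snapshot
    queue.Push(100);
    queue.Process();   // handler was removed, nothing is delivered
    return g_received;
}
```

Neither lock is held while a callback runs, so a callback can safely call Remove or Push without deadlocking on the non-reentrant spin lock.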